This gist focuses on Data Collection, one of the stages* of the Data Science methodology. We will also perform basic math operations on a single dataframe to see how they render in Python and R.
I used no versioning system for this gist, whose repo status is flagged as concept because it is intended to be a demo or POC (proof of concept).
- Isaac Arnault - Suggesting two implementations, in Python and R, based on the initial work from Cognitive Class Lab - Module 2, and providing one exercise.
All public gists https://gist.github.com/aiPhD
Copyright 2018, Isaac Arnault
MIT License, http://www.opensource.org/licenses/mit-license.php
- Figure appended in architecture.md, inspired by Cognitiveclass.ai.
- The sample dataframe comes from Spatialkey.com.
- Perform a data collection in Python and R using Jupyter.
⇢ Use the following dataframe from Spatialkey.com.
- How many observations and variables does the dataframe contain? Base your answer on the output of your scripts.
- Calculate Sum, Min, Max and Mean of variable "raisedAmt" using Python (and Pandas) and using R.
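
As a starting point for the exercise, here is a minimal Python sketch using Pandas. It is an illustration under stated assumptions, not the reference solution: the rows in `SAMPLE_CSV` are made-up placeholders rather than the real Spatialkey data, and the `summarize` helper and the `company` and `round` column names are hypothetical. Only the `raisedAmt` variable name comes from the exercise. To work on the real file, pass its path to `pd.read_csv` instead of the in-memory sample.

```python
import io

import pandas as pd

# Made-up placeholder rows, NOT the real Spatialkey dataframe.
# Only the "raisedAmt" column name is taken from the exercise.
SAMPLE_CSV = """company,round,raisedAmt
alpha,a,1000000
beta,b,2500000
gamma,seed,500000
delta,a,4000000
"""


def summarize(df, column="raisedAmt"):
    """Return the dataframe shape plus sum, min, max and mean of one column."""
    n_obs, n_vars = df.shape  # rows are observations, columns are variables
    col = df[column]
    return {
        "observations": n_obs,
        "variables": n_vars,
        "sum": col.sum(),
        "min": col.min(),
        "max": col.max(),
        "mean": col.mean(),
    }


# Data collection step: load the CSV into a dataframe.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

summary = summarize(df)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same questions can be answered in R with `dim()`, `sum()`, `min()`, `max()` and `mean()` applied to a dataframe loaded with `read.csv()`.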
(*) The Data Science methodology comprises ten stages, one of which is Data Collection. See architecture.md.