In this project I solve the following three data science quizzes:
Create a list of numbers with the following properties: 1) at least 100 distinct values, 2) the mean of all values is 1000 (+/- 0.5), 3) the standard deviation is 10 (+/- 0.1).
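One way to satisfy these three constraints is sketched below. It is only an illustration and not necessarily the approach taken in quiz1.py: it starts from evenly spaced values and applies an affine rescaling, which keeps the values distinct while fixing the mean and the population standard deviation. The function name `make_numbers` and its parameters are hypothetical.

```python
import statistics


def make_numbers(n=100, target_mean=1000.0, target_std=10.0):
    """Return n distinct floats with the requested mean and population std.

    Hypothetical helper for illustration; not taken from quiz1.py.
    """
    base = list(range(n))              # n distinct, evenly spaced values
    mu = statistics.fmean(base)        # mean of the base values
    sigma = statistics.pstdev(base)    # population standard deviation
    # An affine map x -> a*x + b with a > 0 preserves distinctness
    # and lets us set the mean and standard deviation directly.
    return [target_mean + target_std * (x - mu) / sigma for x in base]


numbers = make_numbers()
print(len(set(numbers)), statistics.fmean(numbers), statistics.pstdev(numbers))
```

With 100 values, the sample standard deviation (`statistics.stdev`) differs from the population one by a factor of sqrt(100/99), so the result stays inside the +/- 0.1 tolerance under either definition.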
Implement a solution to the following problem:
Imagine there is a country in which couples only want boys. Couples continue to have children until they have their first boy. Once they have a boy, they stop having children. What is the long-term ratio of boys to girls in the country?
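The problem above can be explored with a short Monte Carlo simulation. The sketch below is not necessarily what quiz2.py does. It assumes that every birth is independent and is a boy with probability 0.5, an assumption the problem statement does not spell out. Under that assumption each family ends with exactly one boy and, on average, (1 - p) / p = 1 girl, so the ratio should approach 1. The function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(families=100_000, p_boy=0.5, seed=0):
    """Simulate families that stop at their first boy; return (boys, girls).

    Assumes independent births with P(boy) = p_boy (not stated in the quiz).
    """
    rng = random.Random(seed)          # seeded for reproducibility
    boys = girls = 0
    for _ in range(families):
        while rng.random() >= p_boy:   # a girl is born: the couple tries again
            girls += 1
        boys += 1                      # first boy: the couple stops
    return boys, girls


boys, girls = simulate()
print(boys, girls, boys / girls)
```

The stopping rule changes how children are distributed across families, but not the per-birth probability, which is why the overall ratio stays close to 1.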
Using the data set http://archive.ics.uci.edu/ml/datasets/Adult:
- Explore the data, then visualize and explain two interesting findings of your choice
- Build a linear model to predict whether a person makes over 50K a year
The work on each of these questions is contained in quiz1.py, quiz2.py and quiz3.py.
The development and discussion of the three quizzes are presented in the notebook quizzes.ipynb.
It can also be viewed using this viewer: http://nbviewer.jupyter.org/github/lorrandal/data_science_quizzes/blob/master/quizzes.ipynb
The Data folder contains the dataset used for quiz 3. The plots folder contains the plots from quiz 3.