## How to Start in Data Science with Python
# [github.com/aymanibrahim/pyds](https://github.com/aymanibrahim/pyds)

1. Visit https://github.com/aymanibrahim/pyds
2. Press the "launch binder" button:

   [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/aymanibrahim/pyds/master)

3. Wait until the server has launched, then navigate to `presentations > start_data_science.ipynb`.
4. Press [Alt + R] to start the RISE presentation.

## Overview

- What is Data Science?
- Data Science Learning Path
- Python for Data Science 
- Data Analysis
- Data Visualization
- Q & A

## How to Start in Data Science with Python

### Ayman Ibrahim 
[GitHub](https://github.com/aymanibrahim) [LinkedIn](https://www.linkedin.com/in/aymanibrahim/) [Kaggle](https://www.kaggle.com/aymani) [Twitter](https://twitter.com/AymanIbrahim) [Facebook](https://www.facebook.com/ayman.ibrahim.awad)

# What is Data Science?

<p align="center"> 
<img src="../images/Data_Science_VD.png" width="400" height="300">
</p>

#### [The Data Science Venn Diagram](http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram)

<p align="center"> 
<img src="../images/Applied_Data_Science_with_Python.png" width="200" height="150">
</p>

#### IBM Cognitive Class Learning Path 
#### [Applied Data Science with Python](https://cognitiveclass.ai/badges/applied-data-science-python)

# Python for Data Science  
> Learn how to create your first Python scripts and perform basic hands-on data analysis in a Jupyter-based environment.

[github.com/aymanibrahim/pyds](https://github.com/aymanibrahim/pyds)

![](../images/pyds-1.png)

![](../images/pyds-2.png)

# [01 Basics](https://github.com/aymanibrahim/pyds/blob/master/notebooks/01_Basics.ipynb)
Types            |  Variables | Strings
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/01_Basics/02_types/02_get_type.png)  |  ![](../images/01_Basics/04_variables/07_convert_min_num_hr_num.png)|  ![](../images/01_Basics/05_strings/18_upper.png)

- [x] Hello World
- [ ] Comments
- [ ] Errors
- [ ] Types
- [ ] Expressions
- [ ] Variables
- [ ] Strings
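
The three screenshots above (getting a type, converting minutes to hours, upper-casing a string) can be sketched in a few lines. The values and variable names here are illustrative assumptions, not copied from the notebook.

```python
# Types: type() reports the class of a value.
int_type = type(11)
float_type = type(21.213)
str_type = type("Hello World")

# Variables: store a duration in minutes, then convert it to hours.
total_min = 142            # made-up value
total_hr = total_min / 60  # true division always returns a float

# Strings: methods return a new string; the original is unchanged.
name = "data science"
upper_name = name.upper()

print(int_type, float_type, str_type)
print(total_hr)
print(upper_name)
```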

# [02 Data Structures](https://github.com/aymanibrahim/pyds/blob/master/notebooks/02_Data_Structures.ipynb)
Tuple                      |  Set                      | Dictionary
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/02_Data_Structures/01_Tuples/01_create_tuple.png)  |  ![](../images/02_Data_Structures/01_Tuples/01_create_tuple.png)|  ![](../images/02_Data_Structures/04_Dictionaries/01_dictionary.png)

- [ ] Tuples
- [ ] Lists
- [ ] Sets
- [ ] Dictionaries
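
As a minimal sketch of the four structures listed above, the example below builds one of each. All values are made up for illustration.

```python
# Tuple: ordered and immutable.
ratings = (10, 9, 6, 5, 10)
first_rating = ratings[0]

# List: ordered and mutable.
genres = ["rock", "pop", "rock"]
genres.append("disco")

# Set: unordered, keeps only unique elements.
unique_genres = set(genres)

# Dictionary: maps keys to values.
track_count = {"album_a": 9, "album_b": 10}  # hypothetical keys
track_count["album_c"] = 12                  # add a new key
has_album_a = "album_a" in track_count       # membership test on keys
```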

# [03 Fundamentals](https://github.com/aymanibrahim/pyds/blob/master/notebooks/03_Fundamentals.ipynb)
Condition                      |  Loop                      | Class
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/03_Fundamentals/01_Conditions/05_equality_True.png)  |  ![](../images/03_Fundamentals/03_Loops/12_for_enumerate_2.png)|  ![](../images/03_Fundamentals/06_Classes/02_attributes.png)

- [ ] Conditions 
- [ ] Branching
- [ ] Loops
- [ ] Functions
- [ ] Objects 
- [ ] Classes
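
The topics above fit together in one short sketch: a function that branches on a condition, a loop using `enumerate`, and a small class with attributes and a method. The names `describe_age` and `Circle` are illustrative assumptions, not taken from the notebook.

```python
def describe_age(age):
    """Branching: return a label depending on the condition that holds."""
    if age >= 18:
        return "adult"
    elif age >= 13:
        return "teen"
    else:
        return "child"


class Circle:
    """A class bundles data (attributes) with behavior (methods)."""

    def __init__(self, radius, color="blue"):
        self.radius = radius
        self.color = color

    def add_radius(self, r):
        self.radius += r
        return self.radius


# Loop: enumerate yields (index, value) pairs.
labels = []
for index, age in enumerate([5, 15, 30]):
    labels.append(f"{index}: {describe_age(age)}")

circle = Circle(2)         # create an object from the class
new_radius = circle.add_radius(3)
```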

# [04 Working with Data](https://github.com/aymanibrahim/pyds/blob/master/notebooks/04_Working_with_Data.ipynb)
Read file                       |  pandas DataFrame                     | Specify columns 
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/04_Data/01_Read_files/11_loop_print_each_line.png)  |  ![](../images/04_Data/03_Load_pandas/03_rows_columns.png)|  ![](../images/04_Data/04_Save_pandas/05_select_specified_columns_2.png)

- [ ] Reading files with open
- [ ] Writing files with open
- [ ] Loading data with pandas
- [ ] Working with and saving data with pandas
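
A minimal sketch of the first two items, writing and then reading a text file with `open`, using only the standard library. The file name is hypothetical and the file is created in a temporary directory. For the pandas items, `pandas.read_csv` and `DataFrame.to_csv` are the usual entry points, but they are not shown here.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, so nothing is overwritten.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Writing: mode "w" creates (or truncates) the file.
with open(path, "w") as f:
    f.write("line A\n")
    f.write("line B\n")
    f.write("line C\n")

# Reading: iterate over the file object to get one line at a time.
lines = []
with open(path, "r") as f:
    for line in f:
        lines.append(line.rstrip("\n"))

print(lines)
```

The `with` statement closes the file automatically, even if an error occurs inside the block.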

# [05 Arrays](https://github.com/aymanibrahim/pyds/blob/master/notebooks/05_Arrays.ipynb)
1D Array                      |  2D Array                      | Array slicing
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/05_Arrays/01_Numpy_1D/01_create_1_d_array.png)  |  ![](../images/05_Arrays/02_Numpy_2D/04_size.png)|  ![](../images/05_Arrays/02_Numpy_2D/07_slicing.png)

- [ ] Creating and Manipulating 1D & 2D Arrays
- [ ] Array Operations
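
The sketch below creates a 1D and a 2D NumPy array, inspects their attributes, slices the 2D array, and applies a few element-wise operations. It assumes NumPy is installed, and the values are illustrative.

```python
import numpy as np

# 1D array
a = np.array([0, 1, 2, 3, 4])

# 2D array: a list of rows
b = np.array([[11, 12, 13],
              [21, 22, 23],
              [31, 32, 33]])

print(a.ndim, a.shape, a.size)
print(b.ndim, b.shape, b.size)

# Slicing: rows 0 and 1, column 2
column_slice = b[0:2, 2]

# Operations are applied element by element
u = np.array([1, 0])
v = np.array([0, 1])
vector_sum = u + v
doubled = 2 * a
dot = np.dot(u, v)  # sum of element-wise products
```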

# Data Analysis
>Learn how to analyze data using multi-dimensional arrays in NumPy and manipulate DataFrames in pandas in a Jupyter-based environment.

[github.com/aymanibrahim/dapy](https://github.com/aymanibrahim/dapy)

![](../images/start_ds/dapy-1.png)

![](../images/start_ds/dapy-2.png)

# 01 Intro
Problem            |  Attributes | Types
:-------------------------:|:-------------------------:|:-------------------------:
![](../images/start_ds/01_Intro/01_problem/01_problem.png)  |  ![](../images/start_ds/01_Intro/02_data_analysis/05_attributes.png)|  ![](../images/start_ds/01_Intro/05_insights/01_types.png)

- Understanding the Domain
- Understanding the Dataset
- Python packages for data science
- Importing and Exporting Data in Python
- Basic Insights from Datasets

# 02 Data Wrangling
Distribution            |  Bins | Histogram
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/02_Wrangling/horsepower_distribution.png)  |  ![](../figs/start_ds/02_Wrangling/horsepower_bins.png)|  ![](../figs/start_ds/02_Wrangling/horsepower_histogram.png)

- Identify and Handle Missing Values
- Data Formatting
- Data Normalization 
- Binning
- Indicator variables
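
To make the wrangling steps above concrete, here is a plain-Python sketch on a made-up `horsepower` column. It is a stand-in for illustration only, not the pandas code from the notebooks. It fills a missing value with the mean, applies min-max normalization, bins into three equal-width groups, and builds indicator (one-hot) variables.

```python
from statistics import mean

# Made-up values; None marks a missing entry.
horsepower = [111, 154, None, 102, 262, 68]

# Missing values: replace None with the mean of the observed values.
observed = [h for h in horsepower if h is not None]
fill_value = mean(observed)
filled = [fill_value if h is None else h for h in horsepower]

# Normalization: min-max scaling maps the values onto [0, 1].
low, high = min(filled), max(filled)
normalized = [(h - low) / (high - low) for h in filled]

# Binning: three equal-width bins; the maximum goes into the last bin.
labels = ["Low", "Medium", "High"]
width = (high - low) / len(labels)
bins = [labels[min(int((h - low) / width), len(labels) - 1)] for h in filled]

# Indicator variables: one 0/1 flag per bin label.
indicators = [{label: int(b == label) for label in labels} for b in bins]
```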

# 03 EDA
Heatmap            |  Scatterplot | Boxplot
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/03_EDA/05_pearson/correlation_heatmap.png)  |  ![](../figs/start_ds/03_EDA/05_pearson/positive_correlation_engine_size_price_scatterplot.png)|  ![](../figs/start_ds/03_EDA/01_descriptive/drive_wheels_price_box_plot.png)

- Descriptive Statistics
- Basics of Grouping
- ANOVA
- Correlation
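
Grouping and Pearson correlation can be sketched without any libraries. The records below are made up, and `pearson` is a hypothetical helper written for this example; it is not the code from the notebooks.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Made-up records: (drive_wheels, engine_size, price)
cars = [
    ("fwd", 97, 7000),
    ("fwd", 110, 9000),
    ("rwd", 130, 16500),
    ("rwd", 152, 21000),
    ("4wd", 108, 10000),
]

# Grouping: average price per drive-wheel category.
prices_by_drive = defaultdict(list)
for drive, _, price in cars:
    prices_by_drive[drive].append(price)
mean_price = {drive: mean(prices) for drive, prices in prices_by_drive.items()}


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


engine_size = [car[1] for car in cars]
price = [car[2] for car in cars]
r = pearson(engine_size, price)
```

A value of `r` close to 1 indicates a strong positive linear relationship, close to -1 a strong negative one, and close to 0 no linear relationship.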

# 04 Model Development
3rd Polynomial            |  Actual/Fitted | 11th Polynomial
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/04_Model_Development/03_polyreg/polynomial_fit_price_highway-mpg_3_order.png)  |  ![](../figs/start_ds/04_Model_Development/02_visualize/actual_fitted_values_distplot.png)|  ![](../figs/start_ds/04_Model_Development/03_polyreg/polynomial_fit_price_highway-mpg_11_order.png)

- Simple and Multiple Linear Regression
- Model Evaluation Using Visualization
- Polynomial Regression and Pipelines
- R-squared and MSE for In-Sample Evaluation
- Prediction and Decision Making
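
As a rough illustration of simple linear regression with in-sample evaluation, the sketch below fits a least-squares line and reports MSE and R-squared. It uses plain Python rather than the libraries in the notebooks; the data and the helper names `fit_line` and `evaluate` are assumptions made for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def evaluate(xs, ys, slope, intercept):
    """Return (MSE, R-squared) of the fitted line on the given data."""
    predictions = [slope * x + intercept for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    my = mean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return ss_res / len(ys), 1 - ss_res / ss_tot


# Made-up data: price tends to fall as highway mpg rises.
highway_mpg = [22, 26, 30, 34, 38]
price = [21000, 16500, 13000, 9500, 8000]

slope, intercept = fit_line(highway_mpg, price)
mse, r_squared = evaluate(highway_mpg, price, slope, intercept)

# Prediction for a new value of the predictor.
predicted_price = slope * 32 + intercept
```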

# 05 Model Evaluation
5th Polynomial            |  R^2 | 4 Features
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/05_Model_Evaluation/02_select/degree5_poly_horse_power.png)  |  ![](../figs/start_ds/05_Model_Evaluation/02_select/r_squared_order_horsepower.png)|  ![](../figs/start_ds/05_Model_Evaluation/02_select/distribution_plot_of_predicted_value_using_test_data_vs_data_distribution_of_test_data_with_4_features__distplot.png)

- Model Evaluation
- Over-fitting, Under-fitting and Model Selection
- Ridge Regression
- Grid Search
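
The idea behind ridge regression and grid search can be sketched for a single feature: the penalty `alpha` shrinks the slope, and a grid search tries several `alpha` values and keeps the one with the lowest error on held-out data. This is a plain-Python stand-in under simplifying assumptions (one feature, unpenalized intercept, one fixed train/test split, made-up data), not the code used in the notebooks.

```python
from statistics import mean


def ridge_fit(xs, ys, alpha):
    """One-feature ridge regression with an unpenalized intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha = 0 gives ordinary least squares
    return slope, my - slope * mx


def mse(xs, ys, slope, intercept):
    """Mean squared error of the line on the given data."""
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Made-up data, split by hand into training and test sets.
x_train, y_train = [1, 2, 3, 4, 5], [2.2, 3.9, 6.1, 8.2, 9.8]
x_test, y_test = [6, 7], [12.1, 13.8]

# Grid search: score each candidate alpha on the held-out test set.
alphas = [0.0, 0.1, 1.0, 10.0]
scores = {a: mse(x_test, y_test, *ridge_fit(x_train, y_train, a)) for a in alphas}
best_alpha = min(scores, key=scores.get)
```

Comparing training error with test error for each candidate is also how over-fitting and under-fitting show up in practice.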

# Data Visualization
>Learn how to represent data graphically to convey insights to clients, customers, and stakeholders.

[github.com/aymanibrahim/dvpy](https://github.com/aymanibrahim/dvpy)

![](../images/start_ds/dvpy-1.png)

![](../images/start_ds/dvpy-2.png)

# [01 Intro](https://github.com/aymanibrahim/dvpy/blob/master/notebooks/01_Intro.ipynb)
Line Plot            |  Trend | Top 5
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/01_Intro/immigration_from_haiti.png)  |  ![](../figs/start_ds/01_Intro/immigration_from_china_india.png)|  ![](../figs/start_ds/01_Intro/immigration_trend_top5_countries.png)


- Introduction to Data Visualization
- Introduction to Matplotlib
- Basic Plotting with Matplotlib
- Dataset on Immigration to Canada
- Line Plots

# [02 Basic Visualization](https://github.com/aymanibrahim/dvpy/blob/master/notebooks/02_Basic.ipynb)
Bar Chart            |  Histogram | Area Plot
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/02_Basic/top15_immigration_countries_barh.png)  |  ![](../figs/start_ds/02_Basic/immigration_from_195_countries_histogram.png)|  ![](../figs/start_ds/02_Basic/immigration_trend_top5_countries_modified_alpha.png)


- Area Plots
- Histograms
- Bar Charts

# [03 Specialized Visualization](https://github.com/aymanibrahim/dvpy/blob/master/notebooks/03_Specialized.ipynb)
Pie Chart            |  Box Plot | Bubble Plot
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/03_Specialized/immigration_by_continent_2013_pie.png)  |  ![](../figs/start_ds/03_Specialized/immigration_china_india_boxplot.png)|  ![](../figs/start_ds/03_Specialized/immigration_from_brazil_argentina_bubble.png)


- Pie Charts
- Box Plots
- Scatter Plots
- Bubble Plots

# [04 Advanced Visualization](https://github.com/aymanibrahim/dvpy/blob/master/notebooks/04_Model_Development.ipynb)
Waffle Chart            |  Word Cloud | Regression Plot
:-------------------------:|:-------------------------:|:-------------------------:
![](../figs/start_ds/04_Advanced/immigration_from_denmark_norway_sweden_waffle_chart.png)  |  ![](../figs/start_ds/04_Advanced/top_wordcloud.png)|  ![](../figs/start_ds/04_Advanced/immigration_best_fit_white_seaborn.png)

- Waffle Charts
- Word Clouds
- Seaborn and Regression Plots

# [05 Maps and Geospatial Data](https://github.com/aymanibrahim/dvpy/blob/master/notebooks/05_Maps.ipynb)
Choropleth Map             | Clusters Map
:-------------------------:|:-------------------------:
![](../figs/start_ds/05_Maps/immigration_choropleth_map.png)  |  ![](../figs/start_ds/05_Maps/incidents_sanfran_map_clusters.png)


- Introduction to Folium
- Maps with Markers
- Choropleth Maps

## Mathematics

<p align="center"> 
<img src="../images/linear_algebra_machine_learning.jpg" width="400" height="300">
</p>

#### [Mathematics for Machine Learning: Linear Algebra - Coursera](https://www.coursera.org/learn/linear-algebra-machine-learning)

<p align="center"> 
<img src="../images/multivariate_calculus_machine_learning.jpg" width="400" height="300">
</p>

#### [Mathematics for Machine Learning: Multivariate Calculus - Coursera](https://www.coursera.org/learn/multivariate-calculus-machine-learning)

## Statistics

<p align="center"> 
<img src="../images/intro_to_statistics.jpeg" width="400" height="300">
</p>

#### [Intro to Statistics - Udacity](https://www.udacity.com/course/intro-to-statistics--st101)

## Deep Learning

<p align="center"> 
<img src="../images/ml-tf-gcp.jpg" width="400" height="300">
</p>

#### [Machine Learning with TensorFlow on Google Cloud Platform Specialization - Coursera](https://www.coursera.org/specializations/machine-learning-tensorflow-gcp)

<p align="center"> 
<img src="../images/Practical Deep Learning for Coders.png" width="400" height="300">
</p>

#### [Practical Deep Learning for Coders - fast.ai](https://course.fast.ai)
 

## Data

<p align="center"> 
<img src="../images/open_data.jpeg" width="400" height="300">
</p>

#### [Open Data Sources](https://www.freecodecamp.org/news/https-medium-freecodecamp-org-best-free-open-data-sources-anyone-can-use-a65b514b0f2d)
 

<p align="center"> 
<img src="../images/Kaggle.png" width="400" height="300">
</p>

#### [Kaggle](https://www.kaggle.com)
 

## Q & A