Notes for a course in data science using Python. Work in progress

matteocourthoud/Data-Science-Python

Data Science with Python

⚠️ WORK IN PROGRESS! ⚠️

Welcome to my Data Science with Python course! I am happy to share my work and I am even happier if it can be useful.

Content

  1. Data Structures
    • Lists
    • Tuples
    • Sets
    • Dictionaries
    • Numpy arrays
    • Pandas DataFrames
    • Pyspark DataFrames
  2. Data Exploration
    • Import, export data
    • Descriptives and summary statistics
    • Pivot tables and aggregation
  3. Data Types
    • Numerical data
    • String data
    • Time data
    • Missing data
  4. Data Wrangling
    • Rows: sorting, indexing, ...
    • Columns: renaming, ordering, ...
    • Collapse and aggregate
    • Reshape
    • Concatenate and merge
  5. Plotting
    • Distributions
    • Time Series
    • Correlations
    • Regression
    • Geographical data
  6. Machine Learning Pipeline
    • Data exploration
    • Encoding and normalization
    • Missing values
    • Weighting
    • Prediction
    • Cross-validation
  7. Web Scraping
    • Pandas
    • APIs
    • Static web scraping
    • Dynamic web scraping
  8. TBD
    • What is missing? Let me know!

Contacts

If you find any typos or mistakes, please open a new issue or, even better, fork the repo and submit a pull request. All feedback is greatly appreciated!
