emma-oc/ds-class-intro

ds-class-intro

Welcome to the intro-level DS class, where we will learn Python basics and how to use Python for exploratory data analysis. I hope you enjoy the class and learn something from it.

0. Get started

1. Python basics

You can run Python in several settings: use a Jupyter notebook for interactive exploration, use the interpreter on the command line by typing `python` in a terminal (a `>>>` prompt will appear), or run a Python script from the command line with `python <your_script>.py`. We will use notebooks for the class because they are easy to follow with Markdown and easy to interact with.
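As a rough illustration of the three settings above, here is a minimal sketch of a script. The file name `hello.py` and the function `greet` are made up for this example and are not part of the class material. The same code can be pasted into a notebook cell, typed line by line at the `>>>` prompt, or saved to a file and run with `python hello.py`.

```python
# hello.py (hypothetical file name, used only for illustration)


def greet(name):
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # This branch runs when the file is executed as a script
    # (python hello.py), but not when the file is imported.
    print(greet("class"))
```

In a notebook or at the interpreter prompt, you would typically define `greet` and then call `greet("class")` directly to see the returned string.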

class01:

0. Environment set up (material in section 0)
1. Assign values to variables and simple arithmetic
2. `print` and simple string manipulation

class02:

3. Value comparison and conditions using `if-elif-else`
4. Collections: list, tuple, set, and dictionary
*  Git: committing, pushing, and pull requests

Homework_01 (Exercise 0, 3, 4) is due next class. Please refer to the homework submission instructions for how to open a pull request for submission.

class03:

*  HW01 review
5. Iteration: loops and comprehensions

class04:

5. Iteration: loops and comprehensions
6. Writing functions

class05:

6. Writing functions
7. Reading and writing files

Homework_02 (Exercise 5, 6, 7) is assigned; it's due next Wednesday, 5/6.

This time, please submit `.py` files for all exercises. As before, once you're done, open a PR with these files.

class06:

8. Intro to code performance ~~Useful basic modules (numpy, os, datetime)~~
9. Coding challenge examples on HackerRank

2. Data manipulation using pandas

class07:

1. Intro to `pandas` 
2. Data wrangling

class08:

3. Using `pandas` for EDA

Homework_03 is assigned; it's due next Wednesday, 5/13, so we can spend some time on discussion.

Please spend some time working on the EDA dataset so we can have a good discussion session next week.

class09:

4. Basic plotting + (slightly) advanced EDA topics
*  HW03 review

class10:

5. Mock take-home case study
    - HackerRank coding test
    - Dataset exploration
6. Demo of `pandas-profiling`
7. Discussion of A/B testing

An intro to A/B testing: https://towardsdatascience.com/the-math-behind-a-b-testing-with-example-code-part-1-of-2-7be752e1d06f

3. Python intermediate (if time permits)

0. Simple scripting
1. Introduction to `class`
2. Tests and others

Resources:

Note:
  • Please try to follow and read the provided material so we can cover more during class.
  • Please be respectful of your own time and commit to as many of the assignments as possible :)
  • The internet (primarily Stack Overflow) is your friend if you have questions; you won't be the first or the last with this question. Try a quick Google search and see if you can find existing solutions.
Ref:

https://github.com/tdpetrou/Minimally-Sufficient-Pandas

https://github.com/cmawer/pycon-2017-eda-tutorial/blob/master/EDA-cheat-sheet.md

https://github.com/Tian-Su/Walmart_MI_ML_interview_campus
