GitHub - kjam/data-wrangling-video: Code and examples for O'Reilly's Data Wrangling with Python video course

Data Wrangling with Python (video edition)

Welcome to the code repository for Data Wrangling with Python! If you have any questions, reach out to @kjam on Twitter or GitHub.

Code Structure

Most, but not all, of the code covered in the videos is here. I highly recommend you take time to type out all the code along with the videos, and use these scripts only to double-check your work or to remind yourself of what you've already completed.

Data folder

Although I don't recommend keeping your data in a repository, I've done so here so we can all work from the same files. In the data directory you'll find all of the examples used for the video series. You'll also find copies of some example API responses and web pages, in case the live versions change after the video is available. If one of the scripts handling API or web data doesn't work, you can use the saved files instead by accessing them via a file URI (normally of the form file:///path/to/file_name.html).
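As a rough illustration of the file URI approach, here is a minimal sketch using only the Python 3 standard library. The file name `example_page.html` and the helper `to_file_uri` are hypothetical, and the sketch writes its own temporary stand-in page so it runs without the repository's data directory; in practice you would point it at one of the saved files in data.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def to_file_uri(path):
    """Turn a local path into an absolute file:// URI (hypothetical helper)."""
    return Path(path).resolve().as_uri()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a saved web page; the name and content are made up.
    saved_page = Path(tmp) / "example_page.html"
    saved_page.write_text("<html><body>saved copy</body></html>", encoding="utf-8")

    uri = to_file_uri(saved_page)

    # urlopen understands the file:// scheme, so a script that normally
    # fetches a live URL can be pointed at the local copy instead.
    with urlopen(uri) as response:
        html = response.read().decode("utf-8")

print(uri)
```

Note the three slashes after `file:`: the URI has an empty host part followed by an absolute path.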

Installation

If you are using Python2, use the requirements.txt file. If you are using Python3, use the py3_requirements.txt file.

pip install -r requirements.txt

or

pip install -r py3_requirements.txt
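If you switch between interpreters often, the choice of requirements file can be scripted. This is only a sketch: the function name `requirements_file` is hypothetical and not part of the repository; the two file names are the ones listed above.

```python
import sys


def requirements_file(major_version=sys.version_info[0]):
    """Return the requirements file matching the given Python major version."""
    return "requirements.txt" if major_version == 2 else "py3_requirements.txt"


# Print the pip command for whichever interpreter runs this script.
print("pip install -r " + requirements_file())
```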

Python2 v. Python3

Most of this repository works with both versions. The exception is PDF handling: PDFMiner and pdf_tables are not Python3 compatible. I have begun investigating their portability, but my current advice, if you are using Python3, is to switch to Python2 for just your PDF wrangling, export the results into a form you can read (like a database or file), and then switch back. :)
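The export-then-switch-back workflow might look like the sketch below. It assumes JSON as the intermediate file format (a database would work as well); the function names, the file name, and the placeholder rows are all hypothetical and stand in for whatever your PDF parsing actually produces. The `json` module behaves the same way for this purpose in Python2 and Python3, so the export step can run under one and the load step under the other.

```python
import json
import os
import tempfile


def export_rows(rows, path):
    """Step 1 (run under Python2, after PDF parsing): write rows to a JSON file."""
    with open(path, "w") as handle:
        json.dump(rows, handle)


def load_rows(path):
    """Step 2 (run under Python3): read the exported rows back in."""
    with open(path) as handle:
        return json.load(handle)


# Placeholder rows, not real data from the course.
extracted_rows = [
    {"label": "row one", "value": 12.5},
    {"label": "row two", "value": 7.0},
]

export_path = os.path.join(tempfile.mkdtemp(), "pdf_export_example.json")
export_rows(extracted_rows, export_path)
reloaded_rows = load_rows(export_path)
print(reloaded_rows)
```

Here both steps run in one process only to keep the example self-contained; in practice each function would live in a script run by the matching interpreter.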

Corrections?

If you find any issues in these code examples, feel free to submit an Issue or Pull Request. I appreciate your input!

Questions?

Reach out to @kjam on Twitter or GitHub. @kjam is also often on freenode. :)

Folders and files (37 commits)

actual_video_notebooks
data
.gitignore
0301 - Data Structures.ipynb
0302 - Data Types.ipynb
0303 - Filtering Datasets.ipynb
0304 - Concatenating Datasets.ipynb
0305 - Joining Datasets.ipynb
0306 - Split Apply Combine.ipynb
0306 - Standardizing and Normalizing with Pandas.ipynb
0307 - Simple Statistics with Pandas.ipynb
0401 - Identifying "Bad" Data.ipynb
0402 - Regex Parsing.ipynb
0403 - Fuzzy Matching.ipynb
0404 - Storing Your Data.ipynb
0501 - Outliers and Trends.ipynb
0503 - Writing Performent Code.ipynb
0504 - Parallelizing.ipynb
0601 - Natural Language Processing.ipynb
0602 - Scipy and Numpy.ipynb
0603 - Data Visualization.ipynb
Exports & Imports Transformation.ipynb
README.md
chp2_basic_files.py
chp2_excel_files.py
chp2_pdf_files.py
chp2_tweepy_api.py
chp2_weather_api.py
chp2_web_scraping.py
example_conf.cfg
multiprocessing_example.py
my_startup.py
py3_requirements.txt
requirements.txt
