Goals:
- Develop good SEO content and authority on Pandas. For this, we'll write several more articles on aspects of Pandas than we currently have, and start grouping them into a category.
- Develop a five-day mini-course as an offering for email subscribers.
- Improve the practice question list https://codesolid.com/pandas-practice-examples/. Offer the solution set as part of the course.
-
Pandas introduction (John)
- Core classes - DataFrames and Series.
- See next topic, I think this is easier to understand with datasets, but many authors focus on creating from dictionaries of lists, etc.
- Using the tools
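
For the core classes bullet, a minimal sketch of the "dictionary of lists" approach that many authors start with. The fruit data and variable names are made up for illustration only.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "banana", "cherry"], name="price")

# A DataFrame is a table of columns; here it is built from a dictionary of lists.
fruit = pd.DataFrame({
    "name": ["apple", "banana", "cherry"],
    "price": [1.5, 2.0, 3.25],
    "in_stock": [True, False, True],
})

# Each column of a DataFrame is itself a Series.
price_column = fruit["price"]
print(type(price_column).__name__)
print(fruit.shape)
```

The intro article could contrast this with loading a real dataset, which is the point of the next topic.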
-
"Pandas DataSets" - perhaps one article covering the following:
- Downloading and unzipping an arbitrary file using urllib or requests, plus Python's zipfile module
- Seaborn load_dataset and get_dataset_names.
- https://scikit-learn.org/stable/datasets/toy_dataset.html
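
A possible starting point for the download-and-unzip bullet, using only the standard library (urllib.request and zipfile). The function names `extract_zip_bytes` and `download_and_unzip` are hypothetical helpers, and the URL is whatever the caller supplies; no real dataset URL is assumed here.

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def extract_zip_bytes(data: bytes, dest_dir: str) -> List[str]:
    """Extract an in-memory zip archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest)
        return archive.namelist()


def download_and_unzip(url: str, dest_dir: str) -> List[str]:
    """Download a zip file from a caller-supplied URL and extract it into dest_dir."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    return extract_zip_bytes(data, dest_dir)
```

Splitting the extraction step from the download step keeps the zip handling testable without network access. The extracted CSV could then be passed to `pd.read_csv`.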
-
"Kaggle Datasets" (Full article). On using Kaggle API to download datasets: https://www.kaggle.com/docs/api#interacting-with-datasets
-
Selecting data in pandas (Beginner to Expert)
- Relation to indexing.
- See Indexing and Selecting data
- loc. iloc. Others? SQL Article I have in progress already.
- Multi-indexing.
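
A rough sketch of the selection techniques listed above (loc, iloc, boolean masks, and a MultiIndex), on small made-up frames. The column names and values are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4, 19, 31]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "temp_c"]           # label-based: row "b", column "temp_c"
by_position = df.iloc[0, 0]                # position-based: first row, first column
warm = df.loc[df["temp_c"] > 10, "city"]   # boolean mask combined with a column label

# With a MultiIndex, loc can select on one or several index levels.
multi = pd.DataFrame(
    {"sales": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_product(
        [["2023", "2024"], ["Q1", "Q2"]], names=["year", "quarter"]
    ),
)
sales_2024 = multi.loc["2024"]                # all rows under the outer level "2024"
q2_2023 = multi.loc[("2023", "Q2"), "sales"]  # a single cell via a full index tuple
```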
-
Data cleaning (one article?)
- Filling in / handling missing data
- sklearn has tools for this too?
- removing duplicates
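
For the data cleaning article, a minimal sketch of removing duplicates and then handling missing values. The sample data is made up, and filling with the column mean is just one possible strategy, not a recommendation.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "name": ["Ann", "Bob", "Bob", "Cy"],
    "age": [34.0, 28.0, 28.0, np.nan],
})

# Remove exact duplicate rows (the second "Bob" row).
deduped = raw.drop_duplicates()

# Count missing values per column.
missing_counts = deduped.isna().sum()

# Option 1: fill missing ages with the mean of the known ages.
filled = deduped.fillna({"age": deduped["age"].mean()})

# Option 2: drop rows that contain any missing value.
dropped = deduped.dropna()
```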
-
Data Visualization in Pandas (Bashir) 2000-3000
Data transformation:
- Vectorized string methods / other string techniques
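
A small sketch of vectorized string methods via the `.str` accessor, assuming a made-up Series of messy names. Missing values pass through as missing unless handled explicitly (here with `na=False`).

```python
import pandas as pd

names = pd.Series(["  Alice SMITH", "bob jones ", None, "Carol  Lee"])

# Chain string methods; each one applies element-wise to the whole Series.
cleaned = names.str.strip().str.title()

# Boolean test per element; na=False treats missing values as non-matches.
has_smith = cleaned.str.contains("Smith", na=False)

# Split on whitespace and take the first piece of each result.
first_names = cleaned.str.split().str[0]
```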
-
Grouping data (already have GroupBy article. See below. Anything else?)
-
Pivot tables and cross-tabulation. A lot of this is in McKinney's book under "Data Wrangling: Join / Combine / Reshape". So:
- DataFrame.combine
- DataFrame.merge
- stack and unstack
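
A sketch of how merge, pivot_table, crosstab, and stack/unstack might fit together in one example. The orders and customers tables are invented for illustration; DataFrame.combine is not shown here.

```python
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "product": ["pen", "ink", "ink", "pen"],
    "qty": [2, 1, 3, 5],
})
customers = pd.DataFrame({"customer_id": [1, 2], "region": ["north", "south"]})

# Left join: keep every order, even when the customer has no region on file.
merged = orders.merge(customers, on="customer_id", how="left")

# Pivot table: total quantity per region and product.
table = merged.pivot_table(
    index="region", columns="product", values="qty", aggfunc="sum", fill_value=0
)

# Cross-tabulation: number of orders per region and product.
counts = pd.crosstab(merged["region"], merged["product"])

# stack moves the column labels into the index; unstack reverses it.
stacked = table.stack()
restored = stacked.unstack()
```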
-
Time series data in Pandas - A whole chapter in McKinney. Several articles possible here?
- Time deltas
- Windowing functions?
- Other material, see for example Pandas Time Series / Date Functionality
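
For the time deltas and windowing bullets, a minimal sketch using a made-up daily series. The window size of 3 is arbitrary.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=5, freq="D")
readings = pd.Series([10, 12, 11, 15, 14], index=dates)

# Subtracting two timestamps gives a Timedelta.
elapsed = dates[-1] - dates[0]

# Adding a Timedelta shifts every timestamp in the index.
shifted = dates + pd.Timedelta(hours=6)

# Rolling window: mean over the current and two previous observations.
# The first two results are NaN because the window is not yet full.
rolling_mean = readings.rolling(window=3).mean()
```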
-
- Pandas loading dataframe from various types (this has been done a lot)
JupyterHub + AWS
- Large Datasets in Pandas
- Using SQL with Pandas (in progress)
- How to Use the Pandas GroupBy Method
- Pandas Practice Examples
- How to Work With Google Sheets in Python and Pandas
- Matplotlib vs. Seaborn (Other plotting articles -- relate to this)