Notes from Bordeaux workshop:
- pandas intro notebooks can use a re-work:
  - data structures -> check the pandas-tutorial ones and compare (first dataframe, then series?)
  - data structures -> already do an exercise with titanic? (loading the data, seeing the first 5 rows, plotting one column?)
  - basic operations -> `countries["capital"].apply(lambda x: len(x))` is a bad example, as doing `apply(len)` is the same
  - basic operations -> exercises with titanic?
  - basic operations -> leave out the alignment example? It only adds cognitive load, and does not occur in the examples (that's maybe for the "advanced indexing" one?)
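
For reference, a minimal sketch of why the `apply` example is redundant. The `countries` frame below is a made-up stand-in for the one in the notebook; the `.str.len()` variant is an extra alternative that is not mentioned in the notes.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's `countries` dataframe
countries = pd.DataFrame({
    "country": ["Belgium", "France", "Germany"],
    "capital": ["Brussels", "Paris", "Berlin"],
})

# The current example wraps len in a lambda ...
with_lambda = countries["capital"].apply(lambda x: len(x))
# ... but passing the function directly does the same thing
with_builtin = countries["capital"].apply(len)
# Assumed alternative: the vectorized string accessor
with_str = countries["capital"].str.len()

print(with_lambda.tolist(), with_builtin.tolist(), with_str.tolist())
```

A replacement example would probably need a function that has no one-word equivalent, so that the lambda is actually doing some work.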
- pandas indexing / selecting data:
  - use titanic for exercises (instead of countries)?
  - split in two parts? basic (selecting column(s) + filtering rows) and advanced (actual non-default index, loc/iloc, assignment, index/multi-index)
    - check: do we use loc/iloc in some of the case studies? (I assume we might use it for a combined boolean mask + column selection)
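
A rough sketch of what the basic / advanced split could look like in code. The dataframe is a tiny made-up excerpt shaped like the titanic data (column names and values are assumptions, not the real file).

```python
import pandas as pd

# Made-up rows, only shaped like the titanic dataset
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Survived": [0, 1, 1, 0],
})

# Basic part: select a column, filter rows with a boolean mask
ages = df["Age"]
over_30 = df[df["Age"] > 30]

# Advanced part: boolean mask + column selection in one step with loc
names_over_30 = df.loc[df["Age"] > 30, ["Name", "Age"]]

# iloc is purely positional: first two rows, first two columns
first_two = df.iloc[:2, :2]

print(names_over_30)
```

The combined mask + column selection is the one case where `loc` is hard to avoid, which may be the spot to check in the case studies.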
- new notebook: working with missing data?
- reshaping: too much? (only unstack and not stack?)
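
To help judge whether `stack` can be dropped, a small sketch of how the two relate. The data is made up; the point is only that `stack` is the inverse of `unstack`, so covering `unstack` alone may be enough for a first pass.

```python
import pandas as pd

# Made-up series with a two-level index
s = pd.Series(
    [1, 2, 3, 4],
    index=pd.MultiIndex.from_product(
        [["a", "b"], ["x", "y"]], names=["outer", "inner"]
    ),
)

# unstack: move the inner index level into the columns (long -> wide)
wide = s.unstack()

# stack: move the columns back into the index (wide -> long)
long_again = wide.stack()

print(wide)
```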
- bike_count case study:
  - don't use dayfirst as the solution, only show the comparison
  - more hints (e.g. "use `value_counts`", since they haven't seen that function yet)
    - maybe work with hidden hint html?
  - drop_duplicates -> does not take into account the index!
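
A minimal sketch of the `drop_duplicates` pitfall, on made-up data with timestamps in the index (not the real bike counts). The two alternatives shown, `index.duplicated` and `reset_index`, are suggestions and not necessarily what the case study should use.

```python
import pandas as pd

# Made-up counts: 12:00 appears twice in the index,
# and the value 5 appears at two different timestamps
df = pd.DataFrame(
    {"count": [5, 5, 7, 9]},
    index=pd.to_datetime([
        "2013-06-01 10:00",
        "2013-06-01 11:00",
        "2013-06-01 12:00",
        "2013-06-01 12:00",
    ]),
)

# Only the column values are compared: the 11:00 row is dropped
# (same value as 10:00), while both 12:00 rows are kept
by_values = df.drop_duplicates()

# Deduplicate on the index labels instead
by_index = df[~df.index.duplicated(keep="first")]

# Or compare index and values together by making the index a column first
both = df.reset_index().drop_duplicates()

print(by_values)
print(by_index)
```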
- matplotlib visualization notebook:
  - ax -> ax1
  - set_axis_bgcolor no longer exists (`g.ax_joint.set_axis_bgcolor('0.1')` in one of the seaborn examples)
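
A possible fix, sketched with plain matplotlib: `set_facecolor` is the replacement for the removed `set_axis_bgcolor`. The plotted data is made up, and the `Agg` backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumption for headless runs
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax1.plot([0, 1, 2], [0, 1, 4])

# Replacement for the removed ax1.set_axis_bgcolor('0.1')
ax1.set_facecolor("0.1")

print(ax1.get_facecolor())
```

Since `g.ax_joint` in the seaborn example is a regular matplotlib `Axes`, the same change should apply there: `g.ax_joint.set_facecolor('0.1')`.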
- general: more hints on what to use
- provide good cheatsheet
- Python intro: too little too fast if you don't know Python yet, boring / too slow if you already know Python