Python Notes
fcrimins edited this page Jan 30, 2017 · 21 revisions
Pandas Cheat Sheet (1/30/17)
- "Let's consider two definitions of "good code" so we can be clear what we mean by better.
- Code that is short, concise, and can be written quickly
- Code that is maintainable
- If we're using the first definition, the Python version is "better". If we're using the second, it's far, far worse."
- "I've painted a rather bleak picture of using Python to manipulate complex (and even not-so-complex) data structures in a maintainable way. In truth, however, it's a shortcoming shared by most dynamic languages. In the second half of this article, I'll describe what various people/companies are doing about it, from simple things like the movement towards 'live data in the editor' all the way to the Dropboxian 'type-annotate all the things' (Static Typing in Python). In short, there's a lot of interesting work going on in this space and lots of people are involved (notice the second presenter name [Guido] in that Dropbox deck)."
- Good overview of the tools and IPython Notebook
Why Python is Slow (7/5/16)
- it's the C code that's slow, not the JIT interpreter
- Good explanation of the `async` and `await` keywords introduced in Python 3.5 (similar to `synchronized` and Future in Java)
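
As a rough illustration of the `async` and `await` keywords mentioned above, here is a minimal sketch. The coroutine name `fetch_value` and the delays are hypothetical, and `asyncio.sleep` stands in for real I/O. Note that `asyncio.run` requires Python 3.7 or later, although the keywords themselves date from 3.5.

```python
import asyncio


async def fetch_value(name: str, delay: float) -> str:
    # Hypothetical coroutine: asyncio.sleep stands in for a real I/O wait.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # gather schedules both coroutines concurrently and returns their
    # results in argument order, regardless of which finishes first.
    return await asyncio.gather(
        fetch_value("a", 0.02),
        fetch_value("b", 0.01),
    )


results = asyncio.run(main())
print(results)
```

Awaiting a coroutine suspends only the current task, so the event loop can run the other task in the meantime. This is loosely comparable to waiting on a Future in Java.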
Jamal Moir: An Introduction to Scientific Python (and a Bit of the Maths Behind It) - Matplotlib (4/28/16)
- bottom line: use scipy
How does Python compare to C#? (1/11/16)
- Includes an intro to Pandas, Matplotlib, and Scikit-Learn
Probability, Paradox, and the Reasonable Person Principle (in IPython Notebook, by Peter Norvig)
- To summarize in terms of best performance at summing a list: NumPy ndarray `sum` > pandas Series `sum` > standard library `sum` > for loop > standard library `reduce`.
- DataFrame methods for aggregation and grouping are typically faster to run and write than the equivalent standard library implementation with loops. Where performance is a serious consideration, NumPy ndarray methods offer as much as an order of magnitude increase in speed over DataFrame methods and the standard library.
- Out-of-Core Dataframes in Python: Dask and OpenStreetMap https://jakevdp.github.io/blog/2015/08/14/out-of-core-dataframes-in-python/
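
The summing ranking noted above can be checked locally with a sketch like the following. It is an assumption-laden micro-benchmark, not the original article's code: the list size, the repeat count, and the helper `loop_sum` are illustrative choices, and NumPy and pandas are treated as optional so the script still runs without them. Actual timings and even the ordering may vary by machine and library version.

```python
import timeit
from functools import reduce
from operator import add

# Illustrative input; the size is an arbitrary assumption.
values = list(range(100_000))


def loop_sum(xs):
    # Plain for-loop accumulation, for comparison with the built-ins.
    total = 0
    for x in xs:
        total += x
    return total


candidates = {
    "builtin sum": lambda: sum(values),
    "for loop": lambda: loop_sum(values),
    "functools.reduce": lambda: reduce(add, values),
}

try:
    import numpy as np

    arr = np.array(values, dtype=np.int64)
    candidates["numpy ndarray.sum"] = lambda: int(arr.sum())
except ImportError:
    pass  # NumPy is not installed; skip this candidate.

try:
    import pandas as pd

    series = pd.Series(values, dtype="int64")
    candidates["pandas Series.sum"] = lambda: int(series.sum())
except ImportError:
    pass  # pandas is not installed; skip this candidate.

# Every approach should agree on the total before timing is compared.
results = {name: fn() for name, fn in candidates.items()}
timings = {name: timeit.timeit(fn, number=20) for name, fn in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>20}: {seconds:.4f}s")
```

The arrays are built once, outside the timed calls, so the comparison measures summation only and not the cost of converting a list into an ndarray or Series.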