gumption/A_Few_Optimizations_for_Data_Analysis_in_Python

I read Karolina Alexiou's excellent blog post, "The Top Mistakes Developers Make When Using Python for Big Data Analytics", with great interest. I have made - and partially learned from - all of the mistakes she warned about. I was particularly eager to try out and extend some of the code snippets she provided to illustrate two of the mistakes:

  • Mistake #1: Reinventing the wheel
  • Mistake #2: Not tuning for performance

I started composing a rather lengthy comment on the blog post, highlighting some aspects I especially appreciated and seeking clarification on others. Whenever I notice myself getting a bit voluminous in a comment on someone else's blog, I typically compose a separate post on my own blog (Gumption), and then substitute a link (with a brief summary) on the original blog post.

In this instance, it seemed more appropriate - and constructive - to create an IPython Notebook to illustrate and/or investigate some of the issues I was raising in that comment ... and thereby find some of the clarifications I was initially seeking.

I am sharing those investigations here in case they are of interest or use to others ... and because it's been a while since I created and shared an IPython Notebook about Python and data science.

About

A few Python optimizations for data analysis
