My Somewhat Awesome List
- Pandas
- Knowledge & Learning Resources
- JoyPlots w/ Python
- Networks
- EDA
- Data Visualization
- Data Exploration, Visualization and Dashboarding
- SQL
- Python
- Deep Learning
- Machine Learning
- Practice Questions
- A/B testing
- Probability & Statistics
- Raspberry Pi
- Git and Github
- Tools
- Cool Projects
- Misc
- Optimizing Pandas Code for Speed and Efficiency - Sofia Heisler. Video, Repo, Slides
- Stack Overflow - frequent questions in Pandas (interesting threads)
- Styling DataFrames like Excel Cells. Because, reasons
- Full book for download: Networks, Crowds, and Markets: Reasoning About a Highly Connected World by David Easley and Jon Kleinberg.
- Lada Adamic - SNA course materials (Coursera). (github repo)
- Dash by plot.ly. Also see github. "Dash is a Python framework for building analytical web applications."
- BeakerX - try it for a notebook-like experience for SQL.
- franchise - an open-source notebook for SQL. SQLite, MySQL, PostgreSQL, BigQuery. Github.
- Superset (airbnb). Github.
- dbtools - simple Python interface to SQLite databases.
- Thread on Hacker News with links in comments to visual/browser platforms to run SQL queries and explore data.
- --> Mode's SQL tutorials
- --> Periscope SQL tutorials
- SQL Window Functions Tutorial for Business Analysis
- Set Yourself Apart from Your Peers - Learn CTEs
- ElephantSQL - PostgreSQL hosting, including free plan for testing
- Sample SQL sandbox(es) to practice online - in browser:
- How to query a CSV like SQL:
  - From the CLI with the `querycsv` Python library. Tried and worked! (`pip install querycsv`)
  - Thread with all sorts of other ways.
  - Online tool.
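The "query a CSV like SQL" idea above can also be tried with nothing but the Python standard library. The sketch below is not the `querycsv` API; it is a minimal alternative that loads a CSV into an in-memory SQLite table and queries it. The helper name `load_csv_into_sqlite`, the table name `people`, and the sample data are all made up for illustration, and every column is stored as text, so numeric comparisons need a `CAST`.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_file, table_name="data"):
    """Load a CSV file object into an in-memory SQLite table.

    Hypothetical helper: the first row is used as column names and
    all values are inserted as text.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(":memory:")
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader
    )
    conn.commit()
    return conn


# Made-up sample data standing in for a real CSV file on disk.
sample = io.StringIO(
    "name,city,age\n"
    "Ada,London,36\n"
    "Linus,Helsinki,28\n"
    "Grace,New York,45\n"
)

conn = load_csv_into_sqlite(sample, table_name="people")
rows = conn.execute(
    "SELECT name FROM people WHERE CAST(age AS INTEGER) > 30 ORDER BY name"
).fetchall()
print(rows)
```

For a real file, pass `open("yourfile.csv", newline="")` instead of the `io.StringIO` sample.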
- Time Complexity of Operations on Different Data Structures
- Quick reference/cheat sheets:
- Datetime strftime strptime formatting
- 2–6x speed-up on your pre-processing with 3 lines of code (uses `concurrent.futures`)
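As a rough illustration of the `concurrent.futures` item above, here is a minimal sketch of the usual pattern: map a per-item function over a pool of workers. It is not necessarily the linked article's exact code. The `preprocess` function is a hypothetical stand-in for a real per-item step, and `preprocess_all` is a name made up for this example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def preprocess(text):
    """Hypothetical per-item step: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def preprocess_all(items, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Run preprocess over items in parallel, preserving input order.

    executor_cls is an assumption of this sketch: ProcessPoolExecutor
    suits CPU-bound work, ThreadPoolExecutor suits I/O-bound work.
    """
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(preprocess, items))
```

When using the process-based default, call `preprocess_all(...)` from under an `if __name__ == "__main__":` guard, since worker processes may re-import the script. Any speed-up depends on how expensive each item is; for very cheap steps the overhead of shipping data between processes can outweigh the gain.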
- Determining the number of clusters in a data set - wikipedia
- UC Berkeley CS188 Intro to AI -- Course Materials
- Data Science/Analysis:
- Python:
- SQL:
- Data Structures:
- Probability & Statistics
- Installing PhantomJS: tutorial (search for the PhantomJS bit), and a github repo to look for a release (you'll have to change the link in the tutorial accordingly). Official Download.
- RPi headless Firefox - link.
- Crontab for noobs link. More here. Essentially, each entry is: `timedefinitions python fullpath/script.py`. The full path is case sensitive, so just copy it from the file manager.
- Booting Raspberry Pi Without Monitor link.
- Benchmarking the Pi, also good for testing different modes of heat dissipation.
- Benchmarking performance. link. Essentially: `apt-get install sysbench`, then `sysbench --test=cpu --cpu-max-prime=20000 run`
- VNC - direct and cloud connection set up - link. Download VNC Viewer from RealVNC. Helpful YouTube vid.
- PiCamera:
- Playing video on RPi: in the terminal, navigate to the directory and run `omxplayer filename.mp4` (or `.h264`, etc.).
- ThingSpeak - send sensor data to the cloud, analyze, visualize.
- Turn monitor on and off with PIR sensor
- Physical computing:
- Cool projects:
- Evan Miller - How Not To Run an A/B Test
- Evan Miller - Formulas for Bayesian A/B Testing
- A/B Testing Tech Note: determining sample size
- Read: A Smart Bear - Easy statistics for AdWords A/B testing, and hamsters
- Video: Jason Cohen (A Smart Bear) - mistakes to avoid with A/B testing. Good watch.
- 12 A/B Split Testing Mistakes I See Businesses Make All The Time
- Interesting and Important - Simpson's Paradox in A/B testing
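To make the sample-size question from the links above concrete, here is a small sketch using the common textbook normal approximation for comparing two conversion rates with a two-sided test. This is an assumption of the example, not necessarily the formula used in the linked tech note, and the function name and default `alpha`/`power` values are chosen for illustration only.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided z-test).

    Uses the standard normal-approximation formula for two proportions;
    treat the result as a planning estimate, not an exact requirement.
    """
    delta = abs(p_variant - p_control)
    if delta == 0:
        raise ValueError("the two rates must differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p_control + p_variant) / 2            # pooled rate

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(
            p_control * (1 - p_control) + p_variant * (1 - p_variant)
        )
    ) ** 2
    return math.ceil(numerator / delta ** 2)


# Example: detecting a lift from a 10% to a 12% conversion rate.
n = sample_size_per_variant(0.10, 0.12)
print(n)
```

Smaller expected lifts drive the required sample size up quickly, which is one reason stopping a test early on a promising result is risky.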
- Courses:
- Statistics How To (eye-level explanations)
- Introductory statistical inference videos:
- Nice blog about inferential statistics
- ANOVA:
- Distribution zoo - visualise different distributions + code
- Causal Inference:
- Alex Chan - A Plumber’s Guide to Git
- Markdown cheat sheet and Getting the gist of markdown's formatting syntax.
- How to import a python file from within another python file
- Selenium with Firefox
- Edit the hosts file to effectively block time-wasting websites (tried on Mac, RPi):
  - Open the file: `sudo nano /etc/hosts`
  - Add a new IP and domain name, like so: `0.0.0.0 www.distracting-website.com`
  - Save and exit.
  - You might need to flush the DNS cache. On a Mac, in the terminal, of course: `dscacheutil -flushcache`
  - Make sure you redirect to 0.0.0.0. See here, for example, for more.
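If you block more than a couple of sites, generating the hosts lines can save some typing. The sketch below only builds the text to paste into `/etc/hosts`; it does not touch the file itself. The function name `hosts_block_entries` is made up for this example, and emitting both the bare domain and its `www.` variant is an assumption based on the hosts file matching exact hostnames only.

```python
def hosts_block_entries(domains, sink_ip="0.0.0.0"):
    """Return hosts-file lines that redirect each domain to sink_ip.

    Hypothetical helper: for every input it emits the bare domain and
    the www. variant, skipping duplicates and keeping input order.
    """
    lines = []
    for domain in domains:
        bare = domain.strip().lower()
        if bare.startswith("www."):
            bare = bare[len("www."):]
        for name in (bare, f"www.{bare}"):
            line = f"{sink_ip} {name}"
            if line not in lines:
                lines.append(line)
    return lines


entries = hosts_block_entries(["www.distracting-website.com"])
print("\n".join(entries))
```

Paste the printed lines at the end of the hosts file using the editing steps above.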