This is a short course to help you start programming in Python, especially if you have little or no programming background. There is no need for yet another complete Python course. Instead, I will select a subset of topics that are likely to interest researchers, give a short introduction to each, and point to resources where you can learn more. First we will cover the basics of programming in Python and get everyone up and running. Once you have mastered the basics, each student will design a small programming project, for example downloading data from the internet, reorganizing a dataset, or running a statistical analysis. Many repetitive tasks can be automated with a Python script, and we will learn how to do that over the next few weeks. Over time you will come to appreciate the versatility and simplicity of Python!
Some Links:
- Python Getting Started
- Automate the Boring Stuff is an online course focusing on beginners
- Curated list of links related to Python
- Data Science related Notebooks
- Data Quest Web Scraping Tutorial
- Before we start with Python, you should familiarize yourself with Git and GitHub. Git is a distributed version control system, mainly used for code. To get familiar with Git, follow this tutorial
- Install Python. I suggest the Anaconda Distribution. It comes with many packages preinstalled and an IDE similar to RStudio. Please install a Python 3 version.
- To install an additional package, look up the right install command by searching for it, e.g. search for
conda install unidecode
- you will find this
- using conda you run:
conda install -c anaconda unidecode
(This is NOT a Python command! It is typed into a terminal, not into the Python interpreter.)
- open the "Anaconda prompt" on Windows, or the "terminal" on Mac/Linux
- type
conda install -c anaconda unidecode
and press Enter; answer yes if conda asks whether to install packages
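One way to confirm that the install worked is to ask Python whether it can find the package. This is a minimal sketch using only the standard library; the helper name `is_installed` is made up for this illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the package, without importing it."""
    return importlib.util.find_spec(package_name) is not None


# "json" ships with Python, so this should always be found.
print("json found:", is_installed("json"))

# "unidecode" is only found if the conda install above succeeded.
print("unidecode found:", is_installed("unidecode"))
```

If the second line prints `False`, the package was most likely installed into a different environment than the one running this code.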
0.1. Make a repository on GitHub, put a file in it, and push it to the server.
0.2. Open Spyder and run the "Hello World!" program.
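For exercise 0.2, the program can be as small as the sketch below. The variable name `message` is only an illustration; type the lines into the Spyder editor and run the file.

```python
# Store the text in a variable, then print it to the console.
message = "Hello World!"
print(message)
```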
- Basic introduction: Define variables, lists and dictionaries
- Operators on numbers/strings
- Data types
- Booleans
- if-else, for-, while-loops
After this class you will know the most important programming concepts, which are the building blocks of almost any program.
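The concepts from this class fit into one short script. The following is a sketch with made-up names and values, meant only to show what each concept looks like in code.

```python
# Variables and data types
name = "Ada"            # str (text)
age = 36                # int (whole number)
height_m = 1.70         # float (decimal number)
is_researcher = True    # bool (True or False)

# Operators on numbers and strings
years_to_40 = 40 - age
greeting = "Hello, " + name + "!"

# A list keeps items in order; a dictionary maps keys to values
scores = [3, 8, 5]
capitals = {"France": "Paris", "Japan": "Tokyo"}

# if-else: choose between two branches based on a Boolean condition
if age >= 18 and is_researcher:
    status = "adult researcher"
else:
    status = "other"

# for loop: visit every item of a list
total = 0
for score in scores:
    total += score

# while loop: repeat as long as the condition holds
value = 100
halvings = 0
while value >= 1:
    value = value / 2
    halvings += 1

print(greeting, status, total, capitals["Japan"], halvings)
```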
- Revisit basics
- Functions
- import a package
- numpy/scipy/pandas
- urllib
- File IO using pandas and open/csv
- Beautiful Soup
- Selenium, for dynamic interaction with websites.
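Functions, imports, and file IO can be combined in a small example. The sketch below uses only standard-library modules (`csv`, `statistics`, `tempfile`, `pathlib`); the file name, column names, and the helper `mean_by_group` are invented for illustration. With pandas installed, `pandas.read_csv` would be an alternative way to read the same file.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def mean_by_group(rows, group_key, value_key):
    """Return a dict mapping each group to the mean of its values."""
    groups = {}
    for row in rows:
        # csv gives us strings, so convert the value to a number first.
        groups.setdefault(row[group_key], []).append(float(row[value_key]))
    return {group: statistics.mean(values) for group, values in groups.items()}


# Write a tiny CSV file into a temporary folder, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "survey.csv"

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["country", "score"])
        writer.writerows([["FR", 2], ["FR", 4], ["JP", 5]])

    with open(csv_path, newline="") as f:
        # DictReader turns each line into a dict keyed by the header row.
        rows = list(csv.DictReader(f))

means = mean_by_group(rows, "country", "score")
print(means)
```

Putting the calculation in a function means it can be reused on any list of rows, whichever file they came from.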
From now on, classes will be more like a Q&A session. You can only learn programming by doing, so it is important to get started on your projects; I can help with issues that come up along the way.
- Plan your own project using Python. What concepts do you need to learn? What packages do you need?
Don't know where to start or what to do? Check out these links:
- edX Course on Python for Research
- Andy Halterman on creating and analyzing event data: https://andrewhalterman.com/2017/05/08/making-event-data-from-scratch-a-step-by-step-guide/ or his GitHub page
- Natural Language Processing: e.g. spaCy
- Face Recognition (as a service) by Face++
- Extract text from PDFs: pdfminer
For some tips on best practices for your own code see the top answer here.
- How to use a Web API (Follow the Money example)
- Projects?
- Work on personal project
- Help with issues and setup of project.
- Students should present their use case
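For the Web API session listed above, the basic pattern is to build a request URL from parameters and then decode the JSON reply. The sketch below does not use the Follow the Money API: the endpoint, the parameter names, and the sample reply are all placeholders chosen for illustration, and no network request is actually made.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; replace it with the URL from the API's documentation.
BASE_URL = "https://api.example.org/v1/contributions"


def build_url(base_url, **params):
    """Append URL-encoded query parameters to the base URL."""
    return base_url + "?" + urlencode(params)


def fetch_json(url):
    """Download a URL and decode its JSON body (defined here, not called)."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# The parameter names are assumptions; real APIs document their own.
url = build_url(BASE_URL, state="CA", year=2016, api_key="YOUR_KEY")
print(url)

# A made-up reply standing in for what fetch_json(url) might return.
sample_response = (
    '{"records": [{"donor": "A", "amount": 250.0},'
    ' {"donor": "B", "amount": 100.0}]}'
)
data = json.loads(sample_response)

# Once decoded, the reply is ordinary dictionaries and lists.
total = sum(record["amount"] for record in data["records"])
print(total)
```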