Data Processing in Python (JEM207)

Stable link for online attendance:

Ad hoc lecture 3 - sorry for the issues -

The course site for Data Processing in Python at IES. See information on SIS. The course is taught by Martin Hronec, Vítek Macháček and Jan Šíla.

| Date | Topic | Who | Project | HW |
|---|---|---|---|---|
| 5/10 | Intro, Jupyter, Git (+ GitHub) | Martin | | |
| 11/10 | Seminar (Git) | Martin | | HW 1 |
| 12/10 | Strings, Floats, Lists, Dictionaries, Functions | Vitek | | HW 0 |
| 19/10 | Numpy, Pandas, Matplotlib | Jan | | HW 2 |
| 25/10 | Seminar | Jan | | |
| 26/10 | Object-Oriented Programming | Martin | | HW 3 |
| 2/11 | HTML, XML, JSON, requests, APIs, BeautifulSoup | Jan | | |
| 8/11 | IES Web Scraper | Vitek | | HW 4 |
| 9/11 | Seminar | Vitek | | |
| 16/11 | Advanced Pandas | Vitek | | HW 5 |
| 22/11 | Seminar - MIDTERM | full house | | |
| 23/11 | Introduction to Databases | Jan | Project Topic Proposal | HW 6 |
| 30/11 | Packaging and Documentation | Martin | | |
| 6/12 | Testing (and decorators) | Martin | | |
| 7/12 | Seminar | Martin | Project Topic Approval | |
| 14/12 | Guest lecture | TBD | | |
| 20/12 | Project Work 2 (Seminar) | full house | Work-in-progress | |
| 21/12 | Project Work 2 | full house | Work-in-progress | |
| TBA | Project Deadline | full house | | |

Course requirements

The requirements for passing the course are the DataCamp assignments (5 pts), the midterm (25 pts), the work-in-progress presentation (10 pts), and the final project, including the final delivery presentation (60 pts). At least 50% of the points from the DataCamp assignments and the work-in-progress presentation is required to pass the course.

Final project (60%)

  • Students work in teams of two.
  • Deadline: TBA
  • The task is to download data of your choice from an API or directly from the web. The data should be processed and visualized in a Jupyter Notebook, with auxiliary scripts containing function and class definitions stored as .py files. The project should be submitted as a GitHub repository.
  • The selection of the data is up to the students (conditional on our approval).
  • The Git history serves as proof that both students contributed.
  • More details during the lecture.
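As a rough illustration of the "auxiliary scripts" mentioned above, here is a minimal sketch of what such a .py module could look like. Everything in it is hypothetical: the module name `data_tools.py`, the `Observation` and `ObservationSummary` classes, and the sample data are assumptions made for this example, not part of the course materials. A hard-coded JSON string stands in for a real API response so that the sketch runs without network access.

```python
# data_tools.py -- hypothetical auxiliary module imported by the project notebook
import json
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """One record of the (hypothetical) downloaded dataset."""
    city: str
    temperature: float


def parse_observations(raw_json):
    """Turn a JSON array of objects into a list of Observation records."""
    return [
        Observation(item["city"], float(item["temperature"]))
        for item in json.loads(raw_json)
    ]


class ObservationSummary:
    """Groups observations by city and computes simple statistics."""

    def __init__(self, observations):
        self.by_city = {}
        for obs in observations:
            self.by_city.setdefault(obs.city, []).append(obs.temperature)

    def mean_temperature(self, city):
        """Average temperature over all observations for one city."""
        return mean(self.by_city[city])


# Assumption: in a real project this text would be the body of an API response.
SAMPLE = (
    '[{"city": "Prague", "temperature": 10.0},'
    ' {"city": "Prague", "temperature": 14.0},'
    ' {"city": "Brno", "temperature": 9.5}]'
)

summary = ObservationSummary(parse_observations(SAMPLE))
print(summary.mean_temperature("Prague"))  # 12.0
```

In the notebook, one would then only import the module (for example `from data_tools import ObservationSummary`) and keep the cells focused on analysis and plots.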

Project evaluation criteria

  • Submitted as a Jupyter notebook in a Git repository. All team members pushed to the repo.
  • Code is runnable and replicable (after installation of the necessary packages). Exceptions are accepted only for good reasons (data availability, etc.).
  • OOP and code structure
  • Analysis and visualization
  • Code readability + documentation

See an example project from last year here.

Project work - presentation (10%)

  • Presentation of work-in-progress related to the final project.

Midterm exam (25%)

22/11. Live coding (80 minutes), "open browser", no collaboration between students. More details in the lecture the week before.

DataCamp Assignments (5%)

Submitting 3 of the assignments 1-6 on time is required.

10/10 18:20 - Introduction to Git for Data Science

12/10 18:20

*Deadline extended to Oct 17th at 23:59

19/10 18:30 - Introduction to Data Science in Python

26/10 18:30 - Object-Oriented Programming in Python


Recommended DataCamp Courses


Introduction to Git for Data Science

General Python

Introduction to Python

Intermediate Python for Data Science


pandas Foundations

Manipulating DataFrames with pandas

Merging DataFrames with pandas

Cleaning Data in Python

Web Data Formats

Importing Data in Python (Part 1)

Importing Data in Python (Part 2)

Web Scraping with Python

Data Visualizations

Introduction to Data Visualization

Interactive Data Visualization in Bokeh


Introduction to SQL for Data Science

Introduction to Databases in Python


Econometrics II. (JEB110) is an explicit prerequisite for bachelor students.

The course is designed for students who have at least some basic coding experience. It does not need to be very advanced, but they should be aware of concepts such as for loop, if and else, variable, or function.
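For orientation, the short sketch below shows these four concepts together. It is written in Python only for illustration (the function name `count_parity` is made up for this example); the same ideas carry over from any other language.

```python
def count_parity(numbers):
    """Return (number of even values, number of odd values)."""
    even = 0  # variables holding the running counts
    odd = 0
    for n in numbers:  # for loop over all values
        if n % 2 == 0:  # if branch: the value is even
            even += 1
        else:  # else branch: the value is odd
            odd += 1
    return even, odd


print(count_parity([1, 2, 3, 4]))  # (2, 2)
```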

No knowledge of Python is required for entering the course.


Passing the course is rewarded with 5 ECTS credits.

A sneak peek

IES web parser.



Pro Git book, Atlassian Git tutorials, Github resources for learning Git


Resources from the official Python webpage


Python, Pandas, Numpy, requests, BeautifulSoup and Matplotlib.

