Home
Behnam Yazdanpanahi edited this page May 9, 2024
·
5 revisions
Welcome to the PythonForDataEngineeringCourse wiki!
Python Fundamentals
- Introduction
- First Python program
- Basic Data types (int, float, str, bool)
- Variables, Constants and operators
- Control flow (if statements, loops)
- Data structures (lists, tuples, dictionaries, sets)
  - Lists
  - Tuples
  - Dictionaries
  - Sets
- String manipulation
- Functions and modules
- Error handling (try-except blocks)
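
As a rough illustration of how several of the topics above fit together (functions, loops, lists, dictionaries, sets, and try-except blocks), here is a minimal sketch. The function name `parse_ages` and the sample values are hypothetical and are not taken from the course material.

```python
def parse_ages(raw_values):
    """Convert strings to integers, collecting the values that fail to parse."""
    ages = []
    rejected = []
    for value in raw_values:
        try:
            ages.append(int(value))
        except ValueError:
            # int() raises ValueError for text such as "n/a"
            rejected.append(value)
    return ages, rejected


# Hypothetical sample input mixing valid and invalid entries
ages, rejected = parse_ages(["31", "27", "n/a", "45"])

# A dictionary summarising the result; the set removes duplicate rejects
summary = {"count": len(ages), "oldest": max(ages), "rejected": set(rejected)}
print(summary)
```

Catching only `ValueError`, rather than using a bare `except`, keeps unrelated bugs visible instead of silently swallowing them.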
Assignment: Ass1
Data Manipulation with Pandas
- Introduction to Pandas and its data structures (Series, DataFrame)
- Data manipulation (filtering, sorting, grouping)
- Reading and writing data from/to different file formats
- Data cleaning and transformation techniques
- Handling missing data
- Merging and joining DataFrames
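
The Pandas topics listed above can be sketched in a few lines. This example assumes pandas is installed; the `orders` and `customers` tables and their column names are invented for illustration only.

```python
import pandas as pd

# Hypothetical sample data: one order has a missing amount
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, None, 40.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city": ["Berlin", "Paris", "Paris"],
})

# Handling missing data: treat a missing amount as 0.0
orders["amount"] = orders["amount"].fillna(0.0)

# Filtering: keep only orders with a positive amount
paid = orders[orders["amount"] > 0]

# Merging: attach each customer's city to their orders
merged = paid.merge(customers, on="customer_id", how="left")

# Grouping and sorting: total revenue per city, largest first
revenue_by_city = (
    merged.groupby("city")["amount"].sum().sort_values(ascending=False)
)
print(revenue_by_city)
```

Whether a missing amount should be filled with zero, dropped, or imputed depends on the dataset; `fillna(0.0)` is just one option chosen here to keep the sketch short.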
Numerical Computations with NumPy
- Introduction to NumPy arrays and mathematical operations
- Array operations (slicing, indexing, etc.)
- Array manipulation (reshaping, stacking, splitting)
- Working with random numbers
- Linear algebra operations
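
A small sketch of the NumPy topics above, assuming NumPy is installed. The array contents and the linear system are arbitrary examples chosen for illustration.

```python
import numpy as np

# Reshaping: turn the values 0..5 into a 2x3 array
a = np.arange(6).reshape(2, 3)

# Slicing and indexing: first column of every row
first_column = a[:, 0]

# Stacking: place a scaled copy underneath, giving a 4x3 array
stacked = np.vstack([a, a * 10])

# Splitting: cut the stacked array back into two 2x3 halves
top, bottom = np.split(stacked, 2)

# Random numbers: a seeded generator makes the draw reproducible
rng = np.random.default_rng(seed=42)
noise = rng.normal(size=3)

# Linear algebra: solve 2x + y = 5 and x + 3y = 10
coeffs = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([5.0, 10.0])
solution = np.linalg.solve(coeffs, rhs)
print(solution)
```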
Assignment: Ass2
Data Visualization with Matplotlib and Seaborn
- Data visualization basics
- Line plots, scatter plots, bar plots
- Histograms, box plots, violin plots
- Customizing plots and adding labels, titles, etc.
Assignment: Ass3
Web Crawling with Requests, BeautifulSoup and Selenium
- Introduction to APIs
- Accessing data from APIs
- Introduction to web scraping
- Working with HTML structure
- Scraping data from websites using BeautifulSoup
- Handling dynamic content with Selenium (JavaScript rendering)
- Parsing data from web pages
Assignment: Ass4
Object-Oriented Programming (OOP)
- Object-oriented programming concepts (classes, objects, inheritance, polymorphism)
- Design patterns and best practices in OOP
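
One possible way to illustrate classes, inheritance, and polymorphism in a data engineering setting is sketched below. The class names `DataSource`, `ListSource`, and `RangeSource` are hypothetical and not part of the course code.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Base class: every source has a name and can return its records."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def read(self):
        """Return the records as a list of dictionaries."""

    def describe(self):
        # Works for any subclass because it relies only on read()
        return f"{self.name}: {len(self.read())} records"


class ListSource(DataSource):
    """Serves records that are already held in memory."""

    def __init__(self, name, rows):
        super().__init__(name)
        self.rows = rows

    def read(self):
        return list(self.rows)


class RangeSource(DataSource):
    """Generates n placeholder records on demand."""

    def __init__(self, name, n):
        super().__init__(name)
        self.n = n

    def read(self):
        return [{"id": i} for i in range(self.n)]


# Polymorphism: the same call works on different concrete sources
sources = [ListSource("manual", [{"id": 1}, {"id": 2}]), RangeSource("generated", 3)]
descriptions = [source.describe() for source in sources]
print(descriptions)
```

Code that depends only on the `DataSource` interface does not need to change when a new kind of source is added, which is the main practical benefit of this design.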
Working with Data Sources and Storages, and Serialization
- Reading and writing data from/to various file formats
- Introduction to data serialization formats (JSON, Parquet, Pickle)
- Serializing and deserializing data objects
- Best practices for data serialization and storage
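
A minimal sketch of serializing and deserializing a record with JSON and Pickle, using only the standard library. The `record` dictionary is made-up sample data. Parquet is left out because it needs a third-party library such as pyarrow.

```python
import json
import pickle
from datetime import date

# Hypothetical record containing a value (date) that JSON cannot store natively
record = {
    "id": 7,
    "name": "sensor-a",
    "readings": [1.5, 2.0],
    "installed": date(2024, 5, 9),
}

# JSON: text-based and portable, but dates must be converted to strings first
json_text = json.dumps({**record, "installed": record["installed"].isoformat()})
from_json = json.loads(json_text)
from_json["installed"] = date.fromisoformat(from_json["installed"])

# Pickle: binary and Python-only, but preserves Python types without extra work
pickled = pickle.dumps(record)
from_pickle = pickle.loads(pickled)

print(from_json == record, from_pickle == record)
```

Pickle data should only be loaded from trusted sources, because unpickling can execute arbitrary code.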
Assignment: Ass5
Working with SQL Databases
- Introduction to SQL and relational databases
- SQL basics (SELECT, FROM, WHERE, JOIN)
- Creating and managing databases, tables, and indexes
- CRUD operations (Create, Read, Update, Delete)
- Connecting to databases
- Executing SQL queries
- Fetching and manipulating data with SQL
- Using SQLAlchemy for database interaction
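
The SQL topics above can be tried without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database instead of SQLAlchemy; the table names, column names, and rows are invented for illustration.

```python
import sqlite3

# Connect to a temporary in-memory database
conn = sqlite3.connect(":memory:")

# Create tables and an index
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

# Create: insert rows using parameter placeholders
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 12.5)],
)

# Update: double the order amounts of customer 2
conn.execute("UPDATE orders SET amount = amount * 2 WHERE customer_id = ?", (2,))

# Delete: remove orders below 10.0
conn.execute("DELETE FROM orders WHERE amount < ?", (10.0,))

# Read: join the two tables and total the amounts per customer
rows = conn.execute(
    "SELECT c.name, SUM(o.amount) "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
print(rows)

conn.close()
```

Passing values through `?` placeholders, rather than formatting them into the SQL string, protects against SQL injection.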
Working with NoSQL Databases
- Understanding NoSQL databases (e.g., MongoDB, Redis)
- Connecting to NoSQL databases
- Querying and manipulating data in NoSQL databases
- Handling document-based and key-value data models
Assignment: Ass6
Data Pipelines
- ETL (Extract, Transform, Load)
- Understanding data pipelines and their components
- Designing and architecting data pipelines
- Implementing data ingestion, transformation, and loading (ETL)
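
The extract, transform, and load stages above can be sketched as three small functions. This is a toy example using only the standard library: the CSV text, the `weather` table, and the function names are assumptions made for illustration, and a real pipeline would read from files, APIs, or databases.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the Paris row has a missing temperature
RAW_CSV = """city,temp_c
Berlin,21.5
Paris,
Madrid,30.0
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without a temperature and add Fahrenheit."""
    cleaned = []
    for row in rows:
        if not row["temp_c"]:
            continue
        temp_c = float(row["temp_c"])
        cleaned.append((row["city"], temp_c, round(temp_c * 9 / 5 + 32, 1)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weather (city TEXT, temp_c REAL, temp_f REAL)"
    )
    conn.executemany("INSERT INTO weather VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load(transform(extract(RAW_CSV)), conn)
result = conn.execute("SELECT city, temp_f FROM weather ORDER BY city").fetchall()
print(loaded, result)
```

Keeping each stage as a separate function makes it possible to test and replace stages independently, for example by swapping the in-memory CSV for a file or an API response.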
Assignment: Ass7
Project Development
- Apply all the concepts learned in a real-world data engineering project
- Work with various data sources including web data and APIs
- Implement ETL pipelines, data processing, and analysis using Python libraries and tools
Project: Final Project