- Introduction to Pandas
- Basic statistics in data analysis (descriptive statistics, shape of distribution, inferential statistics)
- Pandas Series
- Pandas DataFrame
- Reading and writing data with Pandas
- Accessing and selecting data
- Data manipulation (filtering, sorting, grouping)
- Data Cleaning
- Merge, Join and Concatenating
Assignment 2: Exploratory Data Analysis (EDA) using Pandas, NumPy, and Matplotlib
- Introduction to Web Scraping and APIs
- Basics of HTML, CSS, and JavaScript
- Getting Started with Web Scraping Using Python
- Dealing with Pagination and Infinite Scrolling
- Working with APIs
- Assignment 1: Scrape the Jobinja website, fetch last year's job information, and save it to a CSV file.
- Introduction to I/O (Input/Output)
- Importance of File I/O (Input/Output)
- Basic Concepts of File I/O
- Text Files
- CSV Files
- JSON (JavaScript Object Notation) Files
- XML Files
- Excel Files
- Binary Files
- Database File Formats (SQLite)
- Parquet and PyArrow
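As a taste of the file I/O topics above, here is a minimal sketch that writes a few records to CSV, reads them back, and stores them as JSON. It uses only the standard library; the sample records and file names are made up for illustration and are not course material.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical sample records, invented for this sketch.
rows = [
    {"name": "Ada", "role": "engineer"},
    {"name": "Linus", "role": "maintainer"},
]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "people.csv"
    json_path = Path(tmp) / "people.json"

    # Write the records as CSV; newline="" lets the csv module control line endings.
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "role"])
        writer.writeheader()
        writer.writerows(rows)

    # Read the CSV back; every value is returned as a string.
    with csv_path.open(newline="", encoding="utf-8") as f:
        loaded = list(csv.DictReader(f))

    # Save the same records as JSON and read them back.
    json_path.write_text(json.dumps(loaded, indent=2), encoding="utf-8")
    from_json = json.loads(json_path.read_text(encoding="utf-8"))
```

Both `loaded` and `from_json` hold the same list of dictionaries as `rows`, because all values here are strings. Numeric columns would need explicit conversion after reading from CSV.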
Object-Oriented Programming (OOP)
- Object-oriented programming concepts (classes, objects, inheritance, polymorphism)
- Design patterns and best practices in OOP
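The OOP concepts listed above (classes, objects, inheritance, polymorphism) can be illustrated with a small sketch. The `Shape`, `Circle`, and `Rectangle` names are hypothetical examples chosen for illustration.

```python
import math


class Shape:
    """Base class: defines the interface that every shape shares."""

    def area(self) -> float:
        raise NotImplementedError("subclasses must implement area()")


class Circle(Shape):
    """Inherits from Shape and provides its own area()."""

    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    """Inherits from Shape and provides its own area()."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


# Polymorphism: the same call works on different object types.
shapes = [Circle(1.0), Rectangle(2.0, 3.0)]
areas = [shape.area() for shape in shapes]
```

The loop never checks which concrete class it is handling; each object supplies its own `area()` implementation.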
Working with SQL Databases
- Introduction to SQL and relational databases
- SQL basics (SELECT, FROM, WHERE, JOIN)
- Creating and managing databases, tables, and indexes
- CRUD operations (Create, Read, Update, Delete)
- Connecting to databases
- Executing SQL queries
- Fetching and manipulating data with SQL
- Using SQLAlchemy for database interaction
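The CRUD operations above can be sketched with Python's built-in `sqlite3` module and an in-memory database. This sketch deliberately uses `sqlite3` rather than SQLAlchemy to stay dependency-free; the `jobs` table and its rows are hypothetical.

```python
import sqlite3

# Connect to a throwaway in-memory database.
conn = sqlite3.connect(":memory:")

# Create: define a table and insert rows using parameterized queries.
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT NOT NULL, city TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (title, city) VALUES (?, ?)",
    [("Data Engineer", "Tehran"), ("Analyst", "Shiraz")],
)

# Update: change one row.
conn.execute("UPDATE jobs SET city = ? WHERE title = ?", ("Isfahan", "Analyst"))

# Read: fetch all rows in insertion order.
rows = conn.execute("SELECT title, city FROM jobs ORDER BY id").fetchall()

# Delete: remove one row, then count what is left.
conn.execute("DELETE FROM jobs WHERE title = ?", ("Analyst",))
remaining = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]

conn.close()
```

The `?` placeholders pass values separately from the SQL text, which avoids SQL injection and quoting mistakes.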
Working with NoSQL Databases
- Understanding NoSQL databases (e.g., MongoDB, Redis)
- Connecting to NoSQL databases
- Querying and manipulating data in NoSQL databases
- Handling document-based and key-value data models
Assignment: Ass7
Data Pipelines
- ETL (Extract, Transform, Load)
- Understanding data pipelines and their components
- Designing and architecting data pipelines
- Implementing data ingestion, transformation, and loading (ETL)
Assignment: Ass8
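The extract, transform, and load stages of this unit can be sketched as three small functions. This is a minimal illustration under stated assumptions, not a production pipeline design: the inline CSV data, the `jobs` table, and the cleaning rules are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from a file, an API, or a scraper.
RAW_CSV = """title,salary
Data Engineer,1200
Analyst,
Backend Developer,1500
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop rows with no salary, normalize titles, cast salary to int."""
    cleaned = []
    for record in records:
        if not record["salary"]:
            continue
        cleaned.append((record["title"].strip().lower(), int(record["salary"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS jobs (title TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO jobs VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT title, salary FROM jobs ORDER BY salary").fetchall()
conn.close()
```

Keeping each stage as a separate function makes it possible to test and replace stages independently, for example by swapping the CSV source for an API call.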
Project Development
- Apply all the concepts learned in a real-world data engineering project
- Work with various data sources including web data and APIs
- Implement ETL pipelines, data processing, and analysis using Python libraries and tools