Data Engineering Challenge: New York Taxi Data Processing

Project Overview

This solution implements a scalable data pipeline that extracts New York taxi trip data, processes it to derive analytical insights, and loads the processed data into a data warehouse for further analysis.

Environment Setup

Prerequisites

Python 3.8+
SQLite

Installation

Clone the repository:

```
git clone <repository_url>
cd New_York_Assignment
```

Create and activate a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install the required packages:
```
pip install -r prerequisites.txt
```

Running the Project

Data Extraction

To download the trip data files (in Parquet format) for the year 2019:

```
python scripts/download_data.py
```
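As a rough illustration of what a download step like this might involve, the sketch below builds the twelve monthly file names for a year and fetches any that are missing. It is not the content of `scripts/download_data.py`. The base URL, the `yellow_tripdata` prefix, the `data/raw` directory, and the function names are all assumptions made for the example.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed location of the public TLC trip-record files (yellow taxi).
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def monthly_files(year=2019, prefix="yellow_tripdata"):
    """Return (url, filename) pairs for the twelve monthly files of a year."""
    pairs = []
    for month in range(1, 13):
        name = f"{prefix}_{year}-{month:02d}.parquet"
        pairs.append((f"{BASE_URL}/{name}", name))
    return pairs


def download_year(year=2019, out_dir="data/raw"):
    """Download every monthly file that is not already on disk."""
    target = Path(out_dir)  # hypothetical output directory
    target.mkdir(parents=True, exist_ok=True)
    for url, name in monthly_files(year):
        dest = target / name
        if not dest.exists():  # skip files fetched by an earlier run
            urlretrieve(url, str(dest))
    return target
```

Calling `download_year(2019)` would fetch the files over the network. Skipping files that already exist keeps a rerun cheap after a partial failure.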

Convert Parquet to CSV

To convert the downloaded Parquet files to CSV:

```
python scripts/parquet_to_csv.py
```
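A conversion like this can be sketched in a few lines with pandas. The example below assumes pandas and a Parquet engine such as pyarrow are installed. The function names and the `data/csv` output directory are hypothetical and not taken from `scripts/parquet_to_csv.py`.

```python
from pathlib import Path


def csv_path_for(parquet_path, out_dir):
    """Map a Parquet path to a CSV file of the same base name in out_dir."""
    return Path(out_dir) / (Path(parquet_path).stem + ".csv")


def convert_to_csv(parquet_path, out_dir="data/csv"):
    """Read one Parquet file and write it back out as CSV."""
    # Imported lazily so the path helper above works without pandas;
    # read_parquet also needs a Parquet engine such as pyarrow.
    import pandas as pd

    dest = csv_path_for(parquet_path, out_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    pd.read_parquet(parquet_path).to_csv(dest, index=False)
    return dest
```

Reading a whole month into memory at once is the simplest approach. For very large files, a chunked reader may be a better fit.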

Data Processing

To clean and transform the converted CSV data:

```
python scripts/processed_data.py
```
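The exact cleaning rules live in `scripts/processed_data.py` and are not documented here. As a minimal sketch of the kind of cleaning and transformation such a step might perform, the example below drops malformed or implausible trips and derives a trip duration. The column names follow the yellow taxi schema and are an assumption, as are the validity rules and the output field names.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def clean_trips(rows):
    """Yield valid trips with numeric fields parsed and a duration added."""
    for row in rows:
        try:
            pickup = datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT)
            dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIME_FORMAT)
            distance = float(row["trip_distance"])
            fare = float(row["fare_amount"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed field
        if dropoff <= pickup or distance <= 0 or fare <= 0:
            continue  # implausible trip (assumed rule, not from the project)
        yield {
            "pickup": pickup.strftime(TIME_FORMAT),
            "dropoff": dropoff.strftime(TIME_FORMAT),
            "trip_distance": distance,
            "fare_amount": fare,
            "duration_min": (dropoff - pickup).total_seconds() / 60,
        }
```

The function accepts any iterable of dictionaries, so it can be fed directly from `csv.DictReader` without loading a whole file into memory.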

Data Loading

To load the processed data into the SQLite database:

```
python scripts/loading_data.py
```
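Loading into SQLite can be done with the standard-library `sqlite3` module alone. The following is a sketch under assumptions, not the project's loader: the `trips` table, its columns, and the `load_trips` function are hypothetical.

```python
import sqlite3

# Hypothetical table layout; the project's real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS trips (
    pickup        TEXT NOT NULL,
    dropoff       TEXT NOT NULL,
    trip_distance REAL NOT NULL,
    fare_amount   REAL NOT NULL,
    duration_min  REAL NOT NULL
)
"""

INSERT_SQL = (
    "INSERT INTO trips VALUES "
    "(:pickup, :dropoff, :trip_distance, :fare_amount, :duration_min)"
)


def load_trips(conn, trips):
    """Insert an iterable of trip dictionaries and return the row count."""
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(INSERT_SQL, trips)
    return conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
```

A connection would typically be opened with `sqlite3.connect(...)` on the database file. Passing `":memory:"` instead gives a throwaway database that is convenient for testing.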

Data Analysis

To generate insights and visualizations:

```
python scripts/analysis_data.py
```
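One insight a pipeline like this could produce is how trip volume and average fare vary by hour of day. The sketch below computes that with a single SQL aggregate. The `trips` table with `pickup` and `fare_amount` columns is an assumption for illustration, and the actual analyses in `scripts/analysis_data.py` may differ.

```python
import sqlite3

# Assumes a hypothetical `trips` table with a text `pickup` timestamp
# in "YYYY-MM-DD HH:MM:SS" form and a numeric `fare_amount` column.
HOURLY_SQL = """
SELECT CAST(strftime('%H', pickup) AS INTEGER) AS pickup_hour,
       COUNT(*)                                AS trip_count,
       AVG(fare_amount)                        AS avg_fare
FROM trips
GROUP BY pickup_hour
ORDER BY pickup_hour
"""


def trips_by_hour(conn):
    """Return (hour, trip_count, avg_fare) tuples, one per pickup hour."""
    return conn.execute(HOURLY_SQL).fetchall()
```

The resulting tuples can be passed to any plotting library to draw a bar chart of demand by hour.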

Repository: MayurKayastha/New_York_Assignment
