Project description and launch instructions

Description

This project extends an earlier data engineering group project, completed as part of the accreditation for the Data Engineering course at UCL. The coursework scope stays the same, but this project:

Merges 3 Parquet files scraped from 2 websites (OpenSea and NFT Showroom) and adds 4 more data tables to the RDS.
Deploys a machine learning pipeline that predicts the total number of sales of an NFT collection from its features. The number of sales ranges from 0 to 10.
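
As an illustration of the merge step, here is a minimal sketch that stacks several scraped tables into one with pandas. It is not taken from the repository: the function names, the `source` column, and the choice of row-wise concatenation (rather than a key-based join) are assumptions, and reading Parquet requires `pyarrow` or `fastparquet` to be installed.

```python
from pathlib import Path

import pandas as pd


def merge_frames(frames: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Stack per-source tables into one, tagging each row with its source.

    Hypothetical helper: the `source` column is an assumption made for
    illustration, not part of the project's schema.
    """
    tagged = [df.assign(source=name) for name, df in frames.items()]
    # concat aligns on column names; a column missing from one source
    # is filled with NaN for that source's rows.
    return pd.concat(tagged, ignore_index=True, sort=False)


def merge_parquet_files(paths: list[Path]) -> pd.DataFrame:
    """Read each Parquet file and merge them, keyed by file name stem."""
    return merge_frames({p.stem: pd.read_parquet(p) for p in paths})
```

Calling `merge_parquet_files(sorted(Path("parquet-files").glob("*.parquet")))` would merge every Parquet file in that folder, assuming the files sit directly inside it.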

The project's objective is to create NFT datasets of the kind currently lacking on Kaggle and to eradicate the problem of weak labelling in the digital art industry.

Instructions

This guide explains how to run the project with Docker.

Type the following in your terminal: git clone https://github.com/marfappv/data_eng_ind.git
Make sure the Dockerfile is built and run from the python-docker folder. It installs all the libraries needed to run the machine learning pipeline code.
Type the following in your terminal: python3 main.py
