PicPay's Technical Challenge: MLOps

The focus of this challenge is to build a data pipeline architecture on AWS using Terraform. At the end of this process, we should be able to fetch a structured table from S3 to train a machine learning model.

The machine learning task is a regression problem: estimating the IBU (International Bitterness Units) of a beer.

Data Dependencies

We are using the following Data Source:

Source	Description
Punk API	Info about a number of artisanal beers.

Column Inputs

Column	Type	Description
id	int	Beer's ID
name	str	Beer's Name
abv	float	The Beer's alcohol by volume
ibu	float	The Beer's International Bitterness Units (IBU)
target_fg	float	The Beer's final gravity
target_og	float	The Beer's original gravity
ebc	float	The Beer's color on the European Brewery Convention (EBC) scale
srm	float	The Beer's color on the Standard Reference Method (SRM) scale, an alternative to ebc
ph	float	The Beer's pH
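The column table above can be read as a small schema. The following is a minimal sketch, assuming each raw record arrives as a dictionary keyed by column name. The `COLUMN_TYPES` mapping and the `select_columns` helper are hypothetical names used for illustration and are not taken from this repository.

```python
from typing import Any, Callable, Dict, Optional

# Column name -> type, as listed in the Column Inputs table.
COLUMN_TYPES: Dict[str, Callable[[Any], Any]] = {
    "id": int,
    "name": str,
    "abv": float,
    "ibu": float,
    "target_fg": float,
    "target_og": float,
    "ebc": float,
    "srm": float,
    "ph": float,
}


def select_columns(record: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Keep only the documented columns and coerce each to its listed type.

    Missing or null values are kept as None. This is an assumption about
    how gaps should be represented; the real pipeline may differ.
    """
    row: Dict[str, Optional[Any]] = {}
    for column, cast in COLUMN_TYPES.items():
        value = record.get(column)
        row[column] = None if value is None else cast(value)
    return row
```

Calling `select_columns` on a record that carries extra keys drops those keys, and any documented column that is absent from the record appears in the result with the value `None`.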

Repository Structure

terraform: The Terraform configuration to create the required architecture. See the Installation section for more details.
analysis: A Jupyter Notebook demonstrating how the fetched data was used to train a machine learning model.
iac_exercise: The project modules.
tests: The project's test directory.

Installation

Run the command below in the root directory to install the required dependencies.

pip install .

To replicate the architecture, it's necessary to set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION as environment variables.

export AWS_ACCESS_KEY_ID="anaccesskey"
export AWS_SECRET_ACCESS_KEY="asecretkey"
export AWS_DEFAULT_REGION="yourregion"
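Before running Terraform, it can help to confirm that these variables are actually set in the current shell. A minimal sketch of such a check follows; the variable names come from the export commands above, while `REQUIRED_VARS` and `missing_aws_vars` are hypothetical names that do not exist in this repository.

```python
import os
from typing import List, Mapping

# Variable names taken from the export commands above.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

An empty list from `missing_aws_vars()` means all three variables are present. The function takes the environment as a parameter so it can be checked against a plain dictionary without touching real credentials.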

The next step is to zip the Lambda functions so that they can be uploaded to your AWS account.

cd terraform
cd lambda
zip fetch_data_from_api.zip fetch_data_from_api.py
zip preprocess_data.zip preprocess_data.py
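Where the `zip` command-line tool is not available, the same archives could be produced with Python's standard `zipfile` module. This is a sketch of an alternative, not the project's own packaging step; the file names come from the commands above, and `LAMBDA_SOURCES` and `zip_lambdas` are hypothetical names.

```python
import zipfile
from pathlib import Path
from typing import List

# File names taken from the zip commands above.
LAMBDA_SOURCES = ("fetch_data_from_api.py", "preprocess_data.py")


def zip_lambdas(lambda_dir: Path) -> List[Path]:
    """Create one .zip archive next to each Lambda source file."""
    archives: List[Path] = []
    for source_name in LAMBDA_SOURCES:
        source = lambda_dir / source_name
        archive = source.with_suffix(".zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            # arcname keeps the file at the archive root, matching what
            # `zip` produces when run from inside the directory.
            zf.write(source, arcname=source.name)
        archives.append(archive)
    return archives
```

From the root directory, `zip_lambdas(Path("terraform/lambda"))` would write `fetch_data_from_api.zip` and `preprocess_data.zip` alongside the source files.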

Usage

From the root directory, run the commands below:

cd terraform
terraform init
terraform apply --auto-approve

