Revolve Solutions Data Test - Starter Project

Prerequisites

Java JDK 8

Go to https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
and under the section "Java SE Development Kit 8u191" (the final digits may vary at the time you're reading this) click the Accept License Agreement radio button and download the version appropriate to your operating system.

Python 3.6.* or later.

See installation instructions at: https://www.python.org/downloads/

Check you have python3 installed:
python3 --version

Preferably an IDE such as Pycharm Community Edition

https://www.jetbrains.com/pycharm/download/

Dependencies and data

Creating a virtual environment

Ensure your pip (package manager) is up to date:
pip3 install --upgrade pip
To check your pip version run:
pip3 --version
python get-pip.pypython get-pip.py Install virtualenv:
pip3 install virtualenv
Create the virtual environment in the root of the cloned project:
virtualenv -p python3 .venv

Activating the newly created virtual environment

You always want your virtual environment to be active when working on this project.
source ./.venv/bin/activate 

Installing Python requirements

This will install some of the packages you might find useful:
pip3 install -r ./requirements.txt

Running tests to ensure everything is working correctly

pytest ./tests

Generating the data

A data generator is included as part of the project in ./input_data_generator/main_data_generator.py This allows you to generate a configurable number of months of data. Although the technical test specification mentions 6 months of data, it's best to generate less than that initially to help improve the debugging process.

To run the data generator use:
python ./input_data_generator/main_data_generator.py

This should produce customers, products and transaction data under ./input_data/starter

Getting started

The skeleton of a possible solution is provided in ./solution/solution_start.py You do not have to use this code if you want to approach the problem in a different way.

Requirement

Customer Shopping Patterns The task involves developing a data pipeline to complete the user story above using sample data sources that will be provided.

Our client is a major high street retailer that handles millions of transactions each day. Their data science team has reached out to our data engineering team >requesting we pre-process some of the data for them at scale so that they can make better use of it in their downstream algorithms. They would like us to deliver >this data weekly.

The input data sources are comprised of customers (in CSV format), transactions (in JSON Lines format) and products (in CSV format).

Acceptance Criteria

The output json should contain information for every customer and has the following fields: customer_id, loyalty_score, product_id, product_category, purchase_count

The code and design should meet the above requirements, follow best practices and should consider future extension or maintenance by different members of the team

Create a Pull Request with appropriate details.

Hints

Use the solution_start.py file to start with

Write small functions which do only one thing

Write test cases for every functions except main

Add comments, logs, and use typing

Add appropriate exception handling

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
input_data/starter		input_data/starter
inputs_data_generator		inputs_data_generator
solution		solution
README.md		README.md
output.json		output.json
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Revolve Solutions Data Test - Starter Project

Prerequisites

Java JDK 8

Python 3.6.* or later.

Preferably an IDE such as Pycharm Community Edition

Dependencies and data

Creating a virtual environment

Activating the newly created virtual environment

Installing Python requirements

Running tests to ensure everything is working correctly

Generating the data

Getting started

Requirement

Acceptance Criteria

Hints

About

Uh oh!

Releases

Packages

Uh oh!

Languages

dakshkayat/Python-Assignment

Folders and files

Latest commit

History

Repository files navigation

Revolve Solutions Data Test - Starter Project

Prerequisites

Java JDK 8

Python 3.6.* or later.

Preferably an IDE such as Pycharm Community Edition

Dependencies and data

Creating a virtual environment

Activating the newly created virtual environment

Installing Python requirements

Running tests to ensure everything is working correctly

Generating the data

Getting started

Requirement

Acceptance Criteria

Hints

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages