Server

In the second part of this project, we will create an API for the categorization model developed previously.

More specifically, you will develop a server that receives product data and returns the best category for each product using a pretrained model.

More info about the data can be found here.

Server API

Your API should be composed of the following components:

  1. Model Loading
    Loads a pretrained model from the path given by the environment variable MODEL_PATH.

  2. Categorization Endpoint
    Exposes a POST endpoint /v1/categorize that receives a JSON with product data and returns a JSON with their predicted categories.

  3. Input Validation
    Returns status 400 (Bad Request) for malformed user input, without crashing the API.

  4. [BONUS] Contract Testing
    Runs automated tests from a file test_api.py to validate the API responses according to different inputs.

NOTE: To test your API, you must provide a JSON file generated from the dataset test_producs.csv, containing a valid input for your API implementation. Save this file at the path given by the environment variable TEST_PRODUCTS_PATH.
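One way to produce such a file is sketched below, using only the standard library. It assumes the CSV has a header row with a title column (matching the input schema in this README). The helper names csv_to_payload and write_payload are hypothetical.

```python
import csv
import json


def csv_to_payload(csv_path, fields=("title",)):
    """Read the CSV and keep only the given columns for each product.

    Assumption: the CSV has a header row that includes every name in `fields`.
    """
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        rows = csv.DictReader(csv_file)
        products = [{field: row[field] for field in fields} for row in rows]
    return {"products": products}


def write_payload(payload, json_path):
    """Save the payload as UTF-8 JSON, keeping accented characters readable."""
    with open(json_path, "w", encoding="utf-8") as json_file:
        json.dump(payload, json_file, ensure_ascii=False, indent=2)
```

A typical call would pass the dataset path to csv_to_payload and then write the result to os.environ["TEST_PRODUCTS_PATH"].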

Implementation

The server API should be implemented with the Flask library in a file named api.py.

Use Python comments to document relevant details about your implementation. Remember that good documentation should focus on the why (e.g., why a specific type of model was chosen), since clean code should be enough to understand the how (e.g., how you selected a specific type of model).
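As a starting point, a minimal sketch of api.py could look like the following. It rests on several assumptions that are not part of this README: the model was serialized with pickle, it exposes a predict method that accepts a list of titles, and the error body uses an "error" key. The names load_model, create_app and build_app_from_env are hypothetical.

```python
import os
import pickle

from flask import Flask, jsonify, request


def load_model(path):
    # Assumption: the pretrained model was saved with pickle.
    with open(path, "rb") as model_file:
        return pickle.load(model_file)


def create_app(model):
    # Receiving the model as an argument lets tests inject a stub
    # instead of loading a real file.
    app = Flask(__name__)

    @app.route("/v1/categorize", methods=["POST"])
    def categorize():
        # silent=True yields None for malformed JSON instead of raising,
        # so bad input is answered with a 400 and the server keeps running.
        payload = request.get_json(silent=True)
        products = payload.get("products") if isinstance(payload, dict) else None
        well_formed = isinstance(products, list) and all(
            isinstance(p, dict) and isinstance(p.get("title"), str)
            for p in products
        )
        if not well_formed:
            return jsonify({"error": "expected a 'products' list with titles"}), 400

        titles = [p["title"] for p in products]
        # Assumption: predict() takes a list of titles, one label per title.
        categories = [str(c) for c in model.predict(titles)] if titles else []
        return jsonify({"categories": categories})

    return app


def build_app_from_env():
    # MODEL_PATH is the environment variable named in this README.
    return create_app(load_model(os.environ["MODEL_PATH"]))
```

To serve it, one option is to call build_app_from_env().run(host="0.0.0.0") at the bottom of api.py. The host value is an assumption about how the container exposes the port.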

Input

The expected input for the server should follow this schema:

{
  "products": [
    {
      "title": "Lembrancinha"
    },
    {
      "title": "Carrinho de Bebê"
    }
  ]
}

You MAY expect to receive other fields besides the title to represent the products. Remember, however, to use the field name specified in the raw data as the key.
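A validation helper along these lines could enforce the schema while tolerating extra fields. This is a sketch, not a required design: the function name extract_titles is hypothetical, and rejecting empty titles is an assumption that goes beyond what the schema states.

```python
def extract_titles(payload):
    """Return the product titles, or raise ValueError describing the problem."""
    if not isinstance(payload, dict) or not isinstance(payload.get("products"), list):
        raise ValueError("'products' must be a list")

    titles = []
    for index, product in enumerate(payload["products"]):
        if not isinstance(product, dict):
            raise ValueError(f"products[{index}] must be an object")
        title = product.get("title")
        # Assumption: an empty or whitespace-only title counts as invalid input.
        if not isinstance(title, str) or not title.strip():
            raise ValueError(f"products[{index}].title must be a non-empty string")
        # Any other keys in the product are simply ignored here.
        titles.append(title)
    return titles
```

The endpoint can catch the ValueError and turn its message into the body of a 400 response.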

Output

The expected output from the server should follow this schema:

{
  "categories": [
    "Lembrancinha",
    "Bebê"
  ]
}

You MUST NOT send other fields besides the category.
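For the bonus contract tests, a check like the one below could be reused across test cases. It is only a sketch: the function name check_response_contract is hypothetical, and the rule of one category per product is inferred from the example above rather than stated explicitly.

```python
def check_response_contract(request_payload, response_payload):
    """Return a list of contract violations (an empty list means compliant)."""
    problems = []
    if set(response_payload) != {"categories"}:
        problems.append("response must contain only the 'categories' key")

    categories = response_payload.get("categories")
    if not isinstance(categories, list) or not all(
        isinstance(category, str) for category in categories
    ):
        problems.append("'categories' must be a list of strings")
    # Assumption: the server returns exactly one category per product.
    elif len(categories) != len(request_payload["products"]):
        problems.append("expected one category per product")
    return problems
```

In test_api.py, each test could post a payload and assert that this function returns an empty list for the parsed response.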

Infrastructure

In this directory, we provide a containerized environment that uses docker and docker-compose to run the API. This should standardize the development environment and avoid compatibility problems.

To install docker and docker-compose, check their official documentation here and here. Both tools can be installed on Linux, macOS, and Windows.

To execute the API, just run the following command:

docker-compose up --build

Then open the link shown at the end of the output.

To install an OS package (Debian-based), add the name of the package to the file packages.txt. To install a Python package (pip-based), add the name and version of the package to the file requirements.txt.

Evaluation

The evaluation will be based on four criteria:

  1. Correctness
    If the solution runs without unexpected errors.

  2. Compliance
    If the solution respects all specified behaviors, in particular concerning inputs and outputs.

  3. Code Quality
    If the solution follows the principles of clean code and general good practices discussed in class.

  4. Documentation
    If the solution documents relevant decisions in the right measure.