
Back-end web application using Python and Flask to build a Rest API by extracting data from My Anime List using web scraping.


luk3mn/mal-api





MAL - API

A back-end web application built with Python and Flask that exposes a REST API over data extracted from My Anime List through web scraping.
Explore the docs »

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. License
  6. Contact
  7. Acknowledgments

About The Project


The project is split into four parts:

  • ETL pipeline: extracts data by web scraping, transforms it, and loads it into the MongoDB database;
  • Database configuration: sets up the connection between the Flask application and the MongoDB collections;
  • REST API: exposes endpoints that serve the data stored in the database;
  • Deployment on AWS: makes the project available on the internet. The application and the database run in separate Docker containers, which also makes it easy to apply changes whenever needed.

Extract and Validation

All the data used in this project comes from My Anime List and is extracted by web scraping with the "Beautiful Soup" library. The scraper goes through the scraped content and organizes it into a dictionary, which makes it easier to validate the data before loading it into the MongoDB database.
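The validation step could look roughly like the sketch below. It is only an illustration: the field names (`name`, `rank`, `score`, `genres`), the 0 to 10 score range check, and the function name `validate_anime` are assumptions, not the keys or rules used by the actual scraper.

```python
from typing import Any

# Hypothetical field names; the real scraper's keys may differ.
REQUIRED_FIELDS = ("name", "rank", "score", "genres")


def validate_anime(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a cleaned copy of a scraped record, or raise ValueError."""
    missing = [field for field in REQUIRED_FIELDS if field not in raw]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    name = str(raw["name"]).strip()
    if not name:
        raise ValueError("name is empty")

    # Scraped values arrive as text, so convert them before storing.
    rank = int(str(raw["rank"]).strip().lstrip("#"))
    score = float(raw["score"])
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")

    # Keep only non-empty genre labels, without surrounding whitespace.
    genres = [g.strip() for g in raw["genres"] if g.strip()]
    return {"name": name, "rank": rank, "score": score, "genres": genres}
```

Rejecting a bad record before the load step keeps malformed documents out of the collection, so the API does not have to guard against them later.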

Load on MongoDB

The connection between Flask and MongoDB uses the "pymongo" library, which provides the connection handling and collection manipulation features the project needs.
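A minimal sketch of such a connection is shown below. It assumes that the `HOST`, `PORT` and `DB_NAME` environment variables listed under Prerequisites describe the MongoDB server; the default values, the collection name `anime`, and the helper names are hypothetical.

```python
import os


def mongo_settings() -> dict:
    """Read connection settings from the HOST, PORT and DB_NAME variables."""
    return {
        # The fallback values are assumptions for local development.
        "host": os.environ.get("HOST", "localhost"),
        "port": int(os.environ.get("PORT", "27017")),
        "db_name": os.environ.get("DB_NAME", "mal"),
    }


def get_collection(name: str = "anime"):
    """Open a client and return one collection (the name is an assumption)."""
    # Imported lazily so the settings helper works without pymongo installed.
    from pymongo import MongoClient

    settings = mongo_settings()
    client = MongoClient(settings["host"], settings["port"])
    return client[settings["db_name"]][name]
```

With a running MongoDB server, `get_collection().insert_one(record)` would store one validated record and `get_collection().find({"rank": 1})` would query it back.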

REST API

The API endpoints are built with the Flask framework. On top of that, a DTO class limits the amount of information returned by each endpoint request.
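One possible shape for such a DTO is sketched here with a standard-library dataclass. The class name `AnimeDTO` and its fields are assumptions chosen to match the routes documented in this README, not the project's actual class.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AnimeDTO:
    """Subset of a stored document that the API exposes (fields are assumed)."""

    name: str
    rank: int
    score: float
    genres: list = field(default_factory=list)

    @classmethod
    def from_document(cls, doc: dict) -> "AnimeDTO":
        # Copy only the listed keys, dropping e.g. MongoDB's internal "_id".
        return cls(
            name=doc["name"],
            rank=doc["rank"],
            score=doc["score"],
            genres=list(doc.get("genres", [])),
        )

    def to_dict(self) -> dict:
        """Return a plain dict that can be serialized as JSON."""
        return asdict(self)
```

A route handler could then return `AnimeDTO.from_document(doc).to_dict()` for each document, so fields that are not part of the DTO never reach the client.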


Extract new data from the data source

  GET /api/v1/anime/extract
Parameter      Type       Description
None           None       Required. Extracts and loads new data

List all anime

  GET /api/v1/anime
Parameter      Type       Description
None           None       Lists all anime

Get by anime name

  GET /api/v1/anime/name/${anime_name}
Parameter      Type       Description
anime_name     string     Gets anime by name

Get anime by genre

  GET /api/v1/anime/genre/${genre_name}
Parameter      Type       Description
genre_name     string     Gets anime by genre

Get anime by rank

  GET /api/v1/anime/rank/${anime_rank}
Parameter      Type       Description
anime_rank     integer    Gets anime by rank

Get anime by score

  GET /api/v1/anime/score/${anime_score}
Parameter      Type       Description
anime_score    integer    Gets anime by score
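The routes above can be called from a small client such as the sketch below. It uses only the standard library; the base URL `http://localhost:5000`, the helper names `anime_url` and `fetch`, and the assumption that responses are JSON are illustrative and not confirmed by the project.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the API is reachable locally on port 5000.
BASE_URL = "http://localhost:5000"


def anime_url(kind: str = "", value=None, base_url: str = BASE_URL) -> str:
    """Build the URL for one of the documented routes."""
    url = f"{base_url}/api/v1/anime"
    if kind:
        url += f"/{kind}"
    if value is not None:
        # Percent-encode the value so names with spaces stay in one path segment.
        url += f"/{quote(str(value), safe='')}"
    return url


def fetch(url: str):
    """Send a GET request and decode the body (assumes the API returns JSON)."""
    with urlopen(url) as response:
        return json.load(response)
```

For example, `fetch(anime_url("rank", 1))` would request `/api/v1/anime/rank/1`, and `anime_url("name", "One Piece")` encodes the space in the name as `%20`.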

(back to top)

Built With

  • Python
  • Flask
  • MongoDB
  • JavaScript

(back to top)

Getting Started

Here are some important topics about this project and how to reproduce it.

Prerequisites

  • virtualenv

    python3 -m venv .venv
  • Environment Variables

    To run this project, you will need to add the following environment variables to your .env file

    HOST

    PORT

    DB_NAME

Installation

Before starting the application in your local environment, complete the following steps.

  1. Clone the repo
    git clone https://github.com/luk3mn/mal-api.git
  2. Install packages
    pip install -r requirements.txt

(back to top)

Usage / Examples

This project can be deployed on AWS with a single EC2 instance by opening port 5000 to any IP address (the "Anywhere" source in the inbound rules). Once the instance is running, follow the Deployment steps below and send requests to the IP address allocated to the EC2 instance on port 5000, using Postman, APIDOG, or any other tool for testing web APIs.

Screenshots

  • GET /api/v1/anime

  • GET /api/v1/anime/rank/1

  • GET /api/v1/anime/rank/50

  • Running on AWS EC2 using Docker containers

Deployment

To deploy this project, follow these steps:

  • Install Docker Engine

  • Install docker-compose

    sudo apt install docker-compose
  • Run the application and MongoDB in containers

    sudo docker-compose up -d

Roadmap

Processing

  • Extract: get data from the source using web scraping
  • Transform: validate the information before storing it in the database
  • Load: store data in the MongoDB database

MongoDB

  • Database configuration
  • Working on repository class

REST API

  • GET /api/v1/anime/extract
  • GET /api/v1/anime
  • GET /api/v1/anime/name/{anime_name}
  • GET /api/v1/anime/genre/{genre_name}
  • GET /api/v1/anime/rank/{anime_rank}
  • GET /api/v1/anime/score/{anime_score}

Docker

  • Run the Python application with Docker
  • Run the MongoDB database with Docker

Deploy

  • AWS

(back to top)

Lessons Learned

This project was an excellent learning experience for me. I was able to dig deeper into REST API architecture using Python and Flask, learn how to deploy an application in containers with Docker, docker-compose and a Dockerfile, and finally run the application in the cloud on AWS EC2 instances.

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Authors

Feedback

If you have any feedback, please reach out to us at lucasnunes2030@gmail.com

Project Link: https://github.com/luk3mn/mal-api

(back to top)

Acknowledgments

I think it would be interesting to place here some references and other resources that were useful and helped me to work on this project. I hope it can help you as well!

(back to top)
