
Distributed RSS Reader

Creation of a series of microservices with REST CRUD operations for managing feeds and feed items, and for storing incoming feed items automatically.

Tech stack: Docker, Quarkus, Eclipse MicroProfile specifications, Swagger / OpenAPI 3, Kafka, PostgreSQL, Maven.

Description

The REST application manages feeds (name, description, feed URL) at localhost:8082/feeds. This makes it possible to manage multiple feeds, since each feed is referred to by its id. For example I can use

curl -X POST "http://localhost:8082/feeds" -H  "accept: application/json" -H  "Content-Type: application/json" -d "{\"description\":\"Dutch online news feed\",\"name\":\"NOS Nieuws\",\"url\":\"http://feeds.nos.nl/nosjournaal?format=xml\"}"

for storing this feed information. The REST call returns the id (1) of the saved feed. I can then use this id to periodically scan the incoming feed items with

curl --location --request GET 'localhost:8080/feeds/stream/v1/1'

This REST call checks and parses the feed items and sends the information through a separate Kafka broker.
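For illustration, the producing side can be sketched roughly as follows. This is not the project's actual code: it assumes SmallRye Reactive Messaging (the Quarkus Kafka connector), a hypothetical feed-items channel mapped to the Kafka topic in application.properties, and items already serialized to JSON.

import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.reactive.messaging.Channel;
import org.eclipse.microprofile.reactive.messaging.Emitter;

@ApplicationScoped
public class FeedItemProducer {

    // "feed-items" is a channel name mapped to a Kafka topic in application.properties
    @Channel("feed-items")
    Emitter<String> emitter;

    // Publish one parsed feed item, already serialized to JSON, to Kafka
    public void publish(String feedItemJson) {
        emitter.send(feedItemJson);
    }
}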

The broker dispatches the information to another Kafka topic; the consumer of that topic checks whether the feed item is already stored (by comparing item title and published date) and, if not, saves it into the database.
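The consuming side and the duplicate check could then look like the sketch below. Channel, entity and field names are hypothetical, and the payload is assumed to carry the published date as epoch milliseconds; FeedItem is the Panache entity sketched further down in this README.

import java.io.StringReader;
import java.util.Date;

import javax.enterprise.context.ApplicationScoped;
import javax.json.Json;
import javax.json.JsonObject;
import javax.transaction.Transactional;

import org.eclipse.microprofile.reactive.messaging.Incoming;

import io.smallrye.common.annotation.Blocking;

@ApplicationScoped
public class FeedItemConsumer {

    // Consumes the JSON payload published by the parsing service
    @Incoming("feed-items")
    @Blocking
    @Transactional
    public void consume(String json) {
        JsonObject obj = Json.createReader(new StringReader(json)).readObject();
        String title = obj.getString("title");
        // Published date assumed to be serialized as epoch milliseconds
        Date publishedDate = new Date(obj.getJsonNumber("publishedDate").longValue());

        // Save only if no row with the same title and published date already exists
        if (FeedItem.count("title = ?1 and publishedDate = ?2", title, publishedDate) == 0) {
            FeedItem item = new FeedItem();
            item.title = title;
            item.publishedDate = publishedDate;
            item.description = obj.getString("description", null);
            item.feedUrl = obj.getString("feedUrl", null);
            item.persist();
        }
    }
}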

For parsing feeds I used ROME, an open-source Java library. It also supports scanning several feed URLs together, so the REST call could easily be modified to support multiple feeds by passing several feed ids.
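A minimal ROME usage sketch (assuming the com.rometools:rome artifact and the NOS feed stored above):

import java.net.URL;

import com.rometools.rome.feed.synd.SyndEntry;
import com.rometools.rome.feed.synd.SyndFeed;
import com.rometools.rome.io.SyndFeedInput;
import com.rometools.rome.io.XmlReader;

public class FeedParserExample {

    public static void main(String[] args) throws Exception {
        // Download and parse the feed XML in one step
        SyndFeed feed = new SyndFeedInput().build(
                new XmlReader(new URL("http://feeds.nos.nl/nosjournaal?format=xml")));

        // Each entry is one feed item: title, link, description, published date, ...
        for (SyndEntry entry : feed.getEntries()) {
            System.out.println(entry.getTitle() + " - " + entry.getPublishedDate());
        }
    }
}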

Since the feeds and the feed items live in separate databases, and there is a "one to many" relationship from feed to feed item, each row of the item table is associated with its feed not by the feed id but by the REST URL that provides the feed data.

E.g. instead of 1 the item row stores localhost:8082/feeds/1. Moreover, feed deletion is logical (a soft delete), to avoid inconsistencies.
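As an illustration, the logical deletion could be implemented along these lines (hypothetical resource and entity names, assuming a Feed Panache entity with a public boolean deleted field; the real endpoint may differ):

import javax.transaction.Transactional;
import javax.ws.rs.DELETE;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

@Path("/feeds")
public class FeedResource {

    // Logical deletion: the row is only flagged, so item rows that reference
    // this feed's REST URL do not end up pointing at missing data
    @DELETE
    @Path("/{id}")
    @Transactional
    public Response delete(@PathParam("id") Long id) {
        Feed feed = Feed.findById(id);
        if (feed == null || feed.deleted) {
            return Response.status(Response.Status.NOT_FOUND).build();
        }
        feed.deleted = true;
        return Response.noContent().build();
    }
}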

For checking stored feeds you can use

curl --location --request GET 'http://localhost:8082/feeds/'

or you can retrieve a specific feed by adding its id to that URL

curl --location --request GET 'http://localhost:8082/feeds/1'

while for checking stored feed items you can use

curl --location --request GET 'http://localhost:8083/feeds/items/'

or you can retrieve a specific item by adding its id to that URL

curl --location --request GET 'http://localhost:8083/feeds/items/1'

Getting Started / Installing

You will need Docker, Java 11 and Maven.

First run mvn clean package in each sub-folder. After that, from the main folder, run docker-compose up --build -d.

At this point everything should be up and running, so you can import the curl commands shown above into Postman (or run them directly if you are on Linux).

You can also use docker-compose-local.yml to bring up only the required application infrastructure (databases and Kafka) and run each application locally with mvn quarkus:dev -Ddebug=false from its sub-folder.

Dependencies

Docker, Kafka, PostgreSQL, Maven, Quarkus, Java 11

Code technical choices explanation

Quarkus needs to avoid private fields in order to run the application in native mode, and Panache entities need to have public fields.

For the Item database table I used TEXT instead of VARCHAR, for performance reasons (it does not affect speed, ref: https://stackoverflow.com/questions/4848964/difference-between-text-and-varchar-character-varying) and to handle item titles, image paths and descriptions that can be longer than 255 characters.
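Putting the two choices together, the item entity might look like the following sketch (field names are illustrative, not the project's actual ones):

import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;

import io.quarkus.hibernate.orm.panache.PanacheEntity;

@Entity
public class FeedItem extends PanacheEntity {

    // Public fields: Panache replaces them with accessors at build time
    @Column(columnDefinition = "TEXT")
    public String title;

    @Column(columnDefinition = "TEXT")
    public String description;

    @Column(columnDefinition = "TEXT")
    public String imagePath;

    // Reference to the owning feed, stored as the feed's REST URL
    // (e.g. http://localhost:8082/feeds/1) instead of a foreign key
    public String feedUrl;

    public Date publishedDate;
}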

DB Dump

From the main folder you need to run

docker ps 

and check the corresponding container ids for the feed_db and feed-item_db databases. At this point you can export the inserted rows and the table structures with

docker exec -it -e PGPASSWORD=1234 9291272e5b75 pg_dump -p 5431 -U root feed_db > feed-db_dump_all.sql
docker exec -it -e PGPASSWORD=1234 11852fb661e3 pg_dump -p 5432 -U root item-db > item-db_dump_all.sql

In my case the container ids of the corresponding databases are '9291272e5b75' and '11852fb661e3'. The dump files will contain the table definitions and the data inside COPY SQL command blocks. If you want INSERT INTO SQL commands instead, you can run:

docker exec -it -e PGPASSWORD=1234 9291272e5b75 pg_dump -p 5431 -U root feed_db --column-inserts --data-only > feed-db_dump.sql
docker exec -it -e PGPASSWORD=1234 11852fb661e3 pg_dump -p 5432 -U root item-db --column-inserts --data-only > item-db_dump.sql

Version History

  • 0.0.1-SNAPSHOT
    • Initial Release
