Repository for an Akka microservice that lifts trained Spark ML models into an actor system with HTTP endpoints.
akka-lift-ml helps you with the hard data engineering part once you have found a good solution with your data science team. The service can train your models on a remote Spark instance and serve the results with a small local Spark service. You can access it over HTTP, e.g. with the integrated Swagger UI. To build your own system you need sbt and Scala. Trained models are saved to AWS S3 and referenced in a PostgreSQL database, so you can scale out your instances for load balancing.
- JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/index.html)
- sbt (http://www.scala-sbt.org/release/docs/Getting-Started/Setup.html)
- Docker for Docker builds (https://www.docker.com/community-edition/)
- AWS account if you want to use a Cognito user pool for authentication (https://aws.amazon.com/de/)
- enough memory for Spark
- Integration of Swagger UI
localhost:8080/v1/swagger/index.html
- Auto-generated Swagger doc from routes as YAML / JSON
localhost:8080/v1/api-docs/swagger.yaml
or localhost:8080/v1/api-docs/swagger.json
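Assuming an instance is running locally on the default port 8080, the documentation endpoints can be composed from one base URL (a sketch; adjust host and port to your setup):

```shell
# Base URL of a locally running instance (assumes the default port 8080)
BASE="http://localhost:8080/v1"

# The Swagger UI lives at $BASE/swagger/index.html
# The generated spec can be fetched while the service runs, e.g.:
#   curl -s "$BASE/api-docs/swagger.json"
echo "$BASE/api-docs/swagger.yaml"
```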
- CRUD repositories via slick-repo
- CORS Support via akka-http-cors
- Authentication with AWS Cognito (JWK) and JWT tokens via nimbusds (in Java)
- Test coverage with ScalaTest and scoverage code coverage report
- Ready for Docker deployment and CloudFormation deployment
- Config file with optional runtime parameters
- In-memory PostgreSQL database for tests
- Flyway database migration
- HikariCP as connection pool
- Logging via Log4j with an XML template
- Collaborative filtering with ALS (Alternating Least Squares), even when the user is not in the rating data
- Easy cleaning of data.
- More Spark MLlib features
- Add more and better tests
- Prepare your data with 3 columns: user, product, rating - a sample can be found in the test resources (retail-raiting.csv)
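A minimal input file in that three-column format can be sketched like this (the user/product IDs and ratings below are illustrative, not taken from the bundled sample):

```shell
# Write a tiny three-column ratings file: header plus a few example rows
cat > /tmp/ratings-sample.csv <<'EOF'
user,product,rating
1,101,5.0
1,102,3.0
2,101,4.0
EOF

# Show the header row
head -n 1 /tmp/ratings-sample.csv
```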
- If you want to train remotely rather than on your local machine, first start your Spark cluster (e.g. a Spark cluster with 1x master & 3x workers via Docker)
- Check out the source code from GitHub
- Start a PostgreSQL database via RDS, Docker, or locally
- Make the related config changes in application.conf or docker.conf
- If you use AWS, make sure the S3 bucket is not in Europe; Spark 2.1 cannot write/read data there
- Create a JAR to serve as the Spark driver
sbt package
- Make sure the path in application.conf is set correctly.
- Run
sbt run
- Go to the Swagger UI (http://localhost:8283/swagger/index.html)
- Send your request to the service
- After successful training you get the result via HTTP GET
- run
sbt docker:publishLocal
to create a Docker container image
For more details and instructions read the wiki.
SQL_URL
- database URL in the scheme jdbc:postgresql://host:port/database-name
SQL_USER
- database user
SQL_PASSWORD
- database password
NIC_IP
- IP address bound to the HTTP service, default is 0.0.0.0
NIC_PORT
- TCP port used for the HTTP service, default is 8080
USER_POOL
- define a Cognito user pool other than the preconfigured one
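The variables can be exported before starting the service, for example (hypothetical values; adjust host, port, and credentials to your environment):

```shell
# Hypothetical connection settings for a local PostgreSQL instance
export SQL_URL="jdbc:postgresql://localhost:5432/liftml"
export SQL_USER="dbuser"
export SQL_PASSWORD="dbpass"

# Bind the HTTP service to all interfaces on the default port
export NIC_IP="0.0.0.0"
export NIC_PORT="8080"

echo "$SQL_URL"
```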
To run application, call:
sbt run
To launch the application in Docker, configure a database Docker instance and run the Docker image generated by sbt.
Generating application docker image and publishing on localhost:
sbt docker:publishLocal
Example of running the generated Docker image:
docker run --name akkaHttp -m 6g -e SQL_USER=dbuser -e SQL_PASSWORD=dbpass -e SQL_URL=jdbcURL -d -p 8283:8283 APPLICATION_IMAGE
APPLICATION_IMAGE
- ID or name of the application Docker image
Look at the --link parameter if the database is also a Docker container.
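The same setup (the service container talking to a database container) can also be sketched as a Docker Compose file; the service names, credentials, and image name below are placeholders, not part of the repository:

```yaml
# docker-compose.yml sketch (all names and credentials are placeholders)
version: "2"
services:
  db:
    image: postgres:9.6
    environment:
      POSTGRES_USER: dbuser
      POSTGRES_PASSWORD: dbpass
  akka-lift-ml:
    image: APPLICATION_IMAGE   # replace with your published image id/name
    mem_limit: 6g
    ports:
      - "8283:8283"
    environment:
      SQL_URL: jdbc:postgresql://db:5432/postgres
      SQL_USER: dbuser
      SQL_PASSWORD: dbpass
    depends_on:
      - db
```

With Compose, the app container reaches the database by its service name (`db`), so an explicit `--link` is not needed.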
To run tests, call:
sbt test
To run all tests with code coverage, call:
sbt clean coverage test
To generate a coverage report after the test run, call:
sbt coverageReport
Tobias Jonas
akka-lift-ml is licensed under Apache License, Version 2.0.
Commercial Support innFactory Cloud & DataEngineering