KptnCook data challenge

Objectives

Create an endpoint to get the number of times user x listened to artist y:
- user_id
- Number of listens
Create an endpoint to update the number of listens to artist y by user x:
- user_id
- Updated number of listens
- Save the dataset to CSV so it can be reused later
Create an endpoint for each user to get recommendations such as:
- Random artists: a sample of 5 artists
- Still unknown artists: a sample of 5 artists the user has never listened to
- Artists that are similar to the ones the user has already listened to
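
The logic behind the first two endpoints and the two sampling-based recommendations could be sketched as below. This is only an illustration, not the code in the repository: it assumes a pandas matrix with one row per user and one column per artist, and the toy data and the function names (get_listens, update_listens, random_artists, unknown_artists) are hypothetical. HTTP routing is left out on purpose.

```python
import random

import pandas as pd

# Hypothetical toy data: one row per user, one column per artist,
# each cell holding that user's listen count for that artist.
listens = pd.DataFrame(
    {"abba": [3, 0, 1], "ac/dc": [0, 2, 0], "adele": [0, 0, 5]},
    index=pd.Index([1, 2, 3], name="user"),
)


def get_listens(df, user_id, artist):
    """Return how many times user_id listened to artist."""
    return int(df.at[user_id, artist])


def update_listens(df, user_id, artist, count, csv_path=None):
    """Set the listen count and, if csv_path is given, save the matrix to CSV."""
    df.at[user_id, artist] = count
    if csv_path is not None:
        df.to_csv(csv_path)
    return get_listens(df, user_id, artist)


def random_artists(df, n=5, seed=None):
    """Sample up to n artists uniformly at random."""
    rng = random.Random(seed)
    artists = list(df.columns)
    return rng.sample(artists, min(n, len(artists)))


def unknown_artists(df, user_id, n=5, seed=None):
    """Sample up to n artists that user_id has never listened to."""
    row = df.loc[user_id]
    never_listened = list(row[row == 0].index)
    rng = random.Random(seed)
    return rng.sample(never_listened, min(n, len(never_listened)))
```

Each function takes the matrix as an argument, so an endpoint handler would only need to load the CSV once and pass the DataFrame in.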

Environment configuration

To build the docker image:

docker build -t kptncook-data-challenge .

To scan the Docker image for vulnerabilities:

docker scan kptncook-data-challenge

To run the docker image:

docker run -d --name kptncook-data-challenge_container -p 80:80 kptncook-data-challenge
out: 176b0cb9897582f3c923fd3c179cb700415762ddc96e081176adb726a3a681e5

Commands to run after a code update

docker stop kptncook-data-challenge_container
docker rm kptncook-data-challenge_container
docker build -t kptncook-data-challenge .  
docker run -d --name kptncook-data-challenge_container -p 80:80 kptncook-data-challenge

Or simply run run_docker.bat.

To access the API: http://localhost

To see the documentation: http://127.0.0.1/docs

Recommendations based on similar artists

To recommend artists similar to the ones a user has already listened to, the steps are:

5 artists the user has already listened to are selected (known artists).
For each of these 5 artists, pandas' corrwith() function is used to find the most similar artist, ranked by correlation coefficient.
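
The two steps above can be sketched as follows. This is a minimal illustration under assumptions, not the repository's implementation: the toy matrix and the helper names most_similar and recommend_similar are hypothetical, and rows are assumed to be users while columns are artists.

```python
import pandas as pd

# Hypothetical toy data: rows are users, columns are artists,
# cells are listen counts.
listens = pd.DataFrame(
    {
        "abba": [5, 4, 0, 0, 1],
        "bee gees": [4, 5, 0, 1, 0],
        "metallica": [0, 0, 5, 4, 0],
        "slayer": [0, 1, 4, 5, 0],
    }
)


def most_similar(df, artist):
    """Return (name, score) of the artist most correlated with `artist`."""
    # corrwith correlates every column of df with the given Series,
    # using the Pearson coefficient by default.
    scores = df.corrwith(df[artist]).drop(artist)
    return scores.idxmax(), float(scores.max())


def recommend_similar(df, known_artists):
    """Map each known artist to its most correlated other artist."""
    return {artist: most_similar(df, artist)[0] for artist in known_artists}


print(recommend_similar(listens, ["abba", "slayer"]))
# prints {'abba': 'bee gees', 'slayer': 'metallica'}
```

In this sketch a recommended artist may be one the user already knows; filtering those out would be an extra step.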

corrwith() supports three correlation methods: pearson, kendall and spearman.
If no method is specified, corrwith() uses the Pearson coefficient, which is interpreted as follows:

Pearson correlation coefficient value	Strength	Direction
Greater than 0.5	Strong	Positive
Between 0.3 and 0.5	Moderate	Positive
Between 0 and 0.3	Weak	Positive
0	None	None
Between 0 and -0.3	Weak	Negative
Between -0.3 and -0.5	Moderate	Negative
Less than -0.5	Strong	Negative
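
The table can be expressed as a small helper for labelling a coefficient. The function name describe_correlation is hypothetical. The table does not say which band the boundary values belong to, so this sketch assumes that exactly 0.3 and exactly 0.5 (in absolute value) count as moderate.

```python
def describe_correlation(r):
    """Return (strength, direction) for a Pearson coefficient r in [-1, 1]."""
    if r == 0:
        return ("None", "None")
    direction = "Positive" if r > 0 else "Negative"
    magnitude = abs(r)
    if magnitude > 0.5:
        strength = "Strong"
    elif magnitude >= 0.3:
        # Assumption: the boundary values 0.3 and 0.5 are treated as moderate.
        strength = "Moderate"
    else:
        strength = "Weak"
    return (strength, direction)


print(describe_correlation(0.2))   # prints ('Weak', 'Positive')
print(describe_correlation(-0.7))  # prints ('Strong', 'Negative')
```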

In the results, most recommendations have a coefficient between 0 and 0.3, which indicates a weak but non-zero correlation.
After some trials with the other two methods, the results appear to be roughly the same. One hypothesis is that the users in this dataset have not listened to enough music, which makes the correlation coefficients less effective.

clementmariebrisson/kptncook-data-challenge

Folders and files

Latest commit

History

Repository files navigation

KptnCook data challenge

Objectives

Environment configuration

Recommendations based on similar artists

Acknowledgments

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages