Endor
At the hackathon, participants will get hands-on exposure to Endor’s proprietary Social Physics technology. To analyze big data from the music industry and generate new insights, participants will use part of Endor’s technology to create new predictions based on Last.fm consumer behavior.
Last.fm has kindly provided us with real-time consumer data. Participants will receive data on artists, songs, albums, transactions and users with the aim of creating new predictions.
Important: Follow the instructions below. They have been updated since the following demo was distributed: https://www.youtube.com/watch?v=LpUv1cIB9hw
(1) Verify that your operating system is supported:
Any UNIX-based OS.
(2) Install Docker CE on your local device:
https://docs.docker.com/install/
(3) Run the following bash script on your local device:
bash <(curl -s https://raw.githubusercontent.com/AthenaWisdom/standalone_scorer/master/bin/run.sh)
Open your device’s Terminal to run the bash script. Running the script also pulls the Docker image to your local device, along with a folder containing files. One of these files is a CSV file, which you will work on.
(4) Retrieve data from the following public s3 bucket:
https://s3.amazonaws.com/endor-hackathon/
This s3 bucket houses consumer data from Last.fm. Review the data to come up with questions on which to base predictions. Read the Additional Guidelines section below.
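To see which files the bucket contains, its listing can be parsed programmatically. The following is a minimal sketch that assumes the bucket allows anonymous listing and returns the standard S3 XML listing format; only the bucket URL comes from the instructions above, and the function names `parse_keys` and `list_bucket` are hypothetical helpers.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Bucket URL taken from step (4); everything else here is an assumption.
BUCKET_URL = "https://s3.amazonaws.com/endor-hackathon/"
# XML namespace used by S3 bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def parse_keys(listing_xml):
    """Return the object keys found in an S3 bucket-listing XML document."""
    root = ET.fromstring(listing_xml)
    return [item.findtext(f"{S3_NS}Key") for item in root.iter(f"{S3_NS}Contents")]


def list_bucket(url=BUCKET_URL):
    """Fetch the bucket listing over HTTP (works only if listing is enabled)."""
    with urllib.request.urlopen(url) as resp:
        return parse_keys(resp.read())
```

Calling `list_bucket()` should return a list of keys; appending a key to the bucket URL gives the download address of that file. A single listing response may be truncated for large buckets, which this sketch does not handle.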
(5) Open the CSV file located in the folder from the bash script:
./input/input/kernel.csv
Edit this file so that it reflects the prediction you plan to generate. You can edit it with tools such as Excel, SQL and R, or by hand.
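The edit can also be scripted. The sketch below is one possible approach, not the official workflow: it assumes that kernel.csv has a header row, that the ID column is called `id`, that the three columns are named `universe`, `white` and `ground`, and that membership is encoded as "1" or "0". All of these are assumptions, so check the header of your own file first.

```python
import csv

# Assumed name of the ID column; check the header row of your kernel.csv.
ID_COL = "id"


def mark_rows(rows, members, id_col=ID_COL):
    """Set each flag column to "1" or "0" depending on set membership.

    rows    -- list of dicts, one per CSV row
    members -- dict mapping a column name to the set of IDs in that subset
    """
    for row in rows:
        for col, ids in members.items():
            row[col] = "1" if row[id_col] in ids else "0"
    return rows


def rewrite_kernel(path, members, id_col=ID_COL):
    """Read a kernel CSV, update the flag columns, and write it back in place."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)
    # Add any flag column that the file does not have yet.
    for col in members:
        if col not in fieldnames:
            fieldnames.append(col)
    mark_rows(rows, members, id_col)
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `rewrite_kernel("./input/input/kernel.csv", {"white": white_ids})` would flag only the IDs in a hypothetical `white_ids` set and leave the other columns untouched. Keep a backup copy of the file, since it is overwritten in place.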
(6) Re-run the bash script from step (3) on your local device, then open the results file:
./output/[DATE]/all_scores.csv
After re-running the bash script, open the docker image’s folder and look for the ./output/[DATE]/all_scores.csv file to view the results. Each ID has a probability attached, predicting how likely that ID is to exhibit behavior X.
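Once the scores file exists, the most likely candidates can be pulled out with a few lines of code. This is a sketch under assumptions: the column names `id` and `score` are hypothetical, so pass the real names from the header of your all_scores.csv.

```python
import csv


def load_scores(path, id_col="id", score_col="score"):
    """Read a scores CSV into a list of (id, probability) pairs."""
    with open(path, newline="") as f:
        return [(row[id_col], float(row[score_col])) for row in csv.DictReader(f)]


def top_k(scores, k=10):
    """Return the k pairs with the highest probability, best first."""
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]
```

Passing the result of `load_scores` to `top_k` gives a short list of the IDs that are predicted as most likely to exhibit behavior X.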
Additional Guidelines
- Review Last.fm’s consumer data (located in the public s3 bucket) in great detail. The better you understand the data, the more interesting the questions you will be able to design.
- Once you have read the data, take the time to write down future-oriented questions about Last.fm’s music consumers. Center the questions on the format “Who is likely to X?”
- Pick a question to start with. Identify which of Last.fm’s consumer data is relevant to that question. Note down the data so that you can easily trace it in the CSV file.
- You will want to focus on identifying the following kinds of consumer data:
  - A subset of the population which already displays behavior X. (Recall: “Who is likely to X?”)
  - A subset of the population for which you want to know how likely they are to display behavior X.
- Modify the CSV file in the following three columns: Universe, White, Ground. Modify these columns for the population IDs that you have identified from Last.fm’s consumer data.
- Social Sphere: All data. In our case, this is all of Last.fm’s data available in the public s3 bucket.
- Universe: A subset of the population for which you are predicting the likelihood of behavior X.
- White: Another subset of the population which already displays that same behavior.
- Ground: The final population subset which is (most) likely to display that same behavior.
- Kernel: A table which contains Universe, White and Ground data. In our case, this is the CSV file.
Note: Every Social Sphere contains Universes, and each Universe contains Whites.
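If the note above is read as set containment (every White ID should also belong to the Universe), the kernel can be checked before the scorer is re-run. The sketch below follows that reading, which is an interpretation rather than a documented rule, and it assumes hypothetical column names `id`, `universe` and `white` with "1" marking membership.

```python
def flagged_ids(rows, col, id_col="id"):
    """Collect the IDs whose flag column is set to "1" (assumed encoding)."""
    return {row[id_col] for row in rows if row.get(col) == "1"}


def whites_outside_universe(rows, id_col="id"):
    """Return White IDs that are not part of the Universe, sorted for stable output."""
    universe = flagged_ids(rows, "universe", id_col)
    white = flagged_ids(rows, "white", id_col)
    return sorted(white - universe)
```

Here `rows` is a list of dicts, for example the result of reading kernel.csv with `csv.DictReader`. An empty result means that every White ID is also in the Universe.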
Catalysts - Using Social Physics technology on the application layer
FakeLabels.com is interested in tracking fraudulent music labels that pretend to distribute new artists’ music, thereby scamming artists into paying service fees. Using real-time data on fraudulent behavior, FakeLabels.com can use Endor’s automated predictions engine to predict other population clusters that may be prone to similar fraudulent behavior. By leveraging blockchain infrastructure for predictive analytics, FakeLabels.com allows music industry peers to share big data across the platform.
Developers - Using Social Physics technology on the scorers layer
NextSpotify.com is developing a next-generation predictive analytics service using blockchain infrastructure. NextSpotify.com has migrated all of Spotify’s data to blockchain infrastructure with the aim of running predictive analytics on this data using Social Physics technology. Given that Spotify already applies its own predictive analytics for song recommendation, NextSpotify.com wants to merge Social Physics technology with Spotify’s predictive analytics to gain additional recommendation insights. By combining Spotify’s scorers with Social Physics scorers, NextSpotify.com can rank population clusters in novel ways and generate new predictions that would not have come out of Spotify’s system alone.