sodik82/IMDb-percentile

IMDB - neo4j

A project for comparing movies and series by their relative standing (percentile) rather than by their "absolute" rating.
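To make the idea of a relative comparison concrete, here is a minimal sketch of a percentile rank computed with awk. It is only an illustration: the function name `percentile_rank` is hypothetical, the "share of ratings less than or equal to the target" definition is an assumption, and the project itself computes its statistics inside Neo4j, not with this script.

```shell
# percentile_rank RATING
# Reads one rating per line on stdin and prints the percentage of ratings
# that are less than or equal to RATING (hypothetical helper, not part of the repo).
percentile_rank() {
  awk -v target="$1" '
    NF  { total++; if ($1 + 0 <= target + 0) below++ }   # skip blank lines
    END { if (total) printf "%.1f\n", 100 * below / total }
  '
}

# Example: a 7.0 among five titles rated 5.0, 6.5, 7.0, 8.2 and 9.1
# printf '5.0\n6.5\n7.0\n8.2\n9.1\n' | percentile_rank 7.0
```

Under this definition a rating of 7.0 is read against the whole distribution, so the same number can mean something different for movies than for series.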

Import data

  1. Download and decompress datasets into import/input
    • files: basics.tsv, episode.tsv, ratings.tsv
    • convert to CSV:
./tsv2csv < input/ratings.tsv > input/ratings.csv
./tsv2csv < input/basics.tsv > input/basics.csv
./tsv2csv < input/episode.tsv > input/episode.csv
  2. Start Neo4j - docker-compose up and optionally open
  3. Optional - change permissions on import/input, since Neo4j will change the permissions
  4. Import data - ./startImport.sh
  5. Do statistics (per type: movie / tvSeries)
TYPE=movie ./doStatistics.sh | tee outputs/moviePerc.txt
grep -v toFloat outputs/moviePerc.txt > outputs/moviePerc.csv
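The three conversion commands in step 1 could be wrapped in one loop, roughly as sketched below. The function name `convert_all` is hypothetical and not part of the repository. The sketch assumes only what the commands above show: that the converter reads TSV on stdin and writes CSV to stdout.

```shell
# convert_all DIR CONVERTER [ARGS...]
# For each dataset, feeds DIR/NAME.tsv to CONVERTER on stdin and writes DIR/NAME.csv.
# Hypothetical helper; the dataset names are the ones listed in step 1.
convert_all() {
  dir=$1
  shift
  for name in ratings basics episode; do
    # Stop at the first failure so an incomplete CSV is not imported later.
    "$@" < "$dir/$name.tsv" > "$dir/$name.csv" || return 1
  done
}

# Usage from the import directory:
# convert_all input ./tsv2csv
```

Passing the converter as an argument keeps the loop independent of how tsv2csv is implemented.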

Start Cypher-shell in docker

docker exec -it imdb-neo4j_neo4j_1 bash
# cd /opt/my-scripts/
# cypher-shell -f <file>
