Project to compare movies and series based on relative comparison rather than "absolute" rating.
- Download and decompress datasets into import/input
  - files: basics.tsv, episode.tsv, ratings.tsv
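The download step can be sketched as a small shell helper. This is only a sketch under assumptions: the source does not say where the datasets come from, so the host below (IMDb's public dataset host, with files named title.<name>.tsv.gz) is assumed, and the names `BASE_URL`, `DEST`, `dataset_url` and `fetch_datasets` are made up for illustration.

```shell
# ASSUMPTION: datasets come from IMDb's public dataset host and are named
# title.<name>.tsv.gz; the source README does not state this.
BASE_URL="${BASE_URL:-https://datasets.imdbws.com}"
DEST="${DEST:-import/input}"

# Print the (assumed) download URL for one dataset, e.g. "ratings".
dataset_url() {
  printf '%s/title.%s.tsv.gz\n' "$BASE_URL" "$1"
}

# Download and decompress the three datasets into $DEST,
# producing basics.tsv, episode.tsv and ratings.tsv.
fetch_datasets() {
  mkdir -p "$DEST" || return 1
  for name in basics episode ratings; do
    curl -fsSL -o "$DEST/$name.tsv.gz" "$(dataset_url "$name")" || return 1
    gunzip -f "$DEST/$name.tsv.gz" || return 1
  done
}
```

Nothing is downloaded until `fetch_datasets` is called; the file names it produces match the ones listed above.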
- Convert the files to CSV:
./tsv2csv < input/ratings.tsv > input/ratings.csv
./tsv2csv < input/basics.tsv > input/basics.csv
./tsv2csv < input/episode.tsv > input/episode.csv
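For readers without the tsv2csv script at hand, the conversion can be approximated with awk. This is not the repository's tsv2csv, whose implementation is not shown here; it is a minimal sketch that assumes tab-separated input with no embedded newlines, and the function name `tsv_to_csv` is hypothetical.

```shell
# Hypothetical stand-in for ./tsv2csv: reads TSV on stdin, writes CSV on stdout.
# Fields containing a comma or a double quote are wrapped in double quotes,
# with embedded double quotes doubled.
tsv_to_csv() {
  awk 'BEGIN { FS = "\t" }
  {
    line = ""
    for (i = 1; i <= NF; i++) {
      field = $i
      if (field ~ /[",]/) {
        gsub(/"/, "\"\"", field)
        field = "\"" field "\""
      }
      line = line (i > 1 ? "," : "") field
    }
    print line
  }'
}
```

It is used the same way as the commands above, for example `tsv_to_csv < input/ratings.tsv > input/ratings.csv`.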
- Start Neo4j - docker-compose up (and optionally open ...)
- Optional - change permissions on import/input, since Neo4j will change the permissions
- Import data - ./startImport.sh
- Do statistics (per type: movie/tvSeries)
TYPE=movie ./doStatistics.sh | tee outputs/moviePerc.txt
cat outputs/moviePerc.txt | grep -v toFloat > outputs/moviePerc.csv
- Run a Cypher script manually inside the Neo4j container:
docker exec -it imdb-neo4j_neo4j_1 bash
# cd /opt/my-scripts/
# cypher-shell -f <file>
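The interactive session above can also be collapsed into a single non-interactive call. A minimal sketch, assuming the container name and script directory shown above and that cypher-shell needs no extra credentials (add them if your setup requires authentication); `run_cypher_file`, `CONTAINER`, `SCRIPT_DIR`, `DRY_RUN` and the file name `stats.cypher` are hypothetical names used for illustration.

```shell
# Container name and script directory are taken from the commands above.
CONTAINER="${CONTAINER:-imdb-neo4j_neo4j_1}"
SCRIPT_DIR="${SCRIPT_DIR:-/opt/my-scripts}"

# Run one Cypher file from $SCRIPT_DIR inside the container.
# With DRY_RUN set to a non-empty value, the command is printed instead of run.
run_cypher_file() {
  ${DRY_RUN:+echo} docker exec "$CONTAINER" cypher-shell -f "$SCRIPT_DIR/$1"
}
```

For example, `DRY_RUN=1 run_cypher_file stats.cypher` only prints the docker command, which is handy for checking the path before running it for real.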