GitCass01/italian-cinemas-sciviz

italian-cinemas-sciviz

Project created for the "Visualizzazione Scientifica" (Scientific Visualization) course, 2022-2023.

Presentation of my results

How I extracted the data

  • Most of the data I found was in PDF format. Fortunately, the data itself was tabular, so I used tabula, a tool written in Java, to extract it to CSV.
  • I couldn't find a way to automate the process. There is also a Python version of tabula, but in this case it was less effective than the original tool.
  • Sometimes the tool missed data, extracted only part of it, or merged multiple columns into one, so I had to restore the data manually. To make this easier, I used a VS Code extension called Edit csv. Its GitHub repository is also available.
  • Other data was in Excel format, but here too I couldn't automate the process, because the file structure was the same only for some years and the sheet names differed as well. This time there was less data, so copying and pasting was easier.
  • To create the Italian actors network graph, I used the TMDB API through tmdbsimple, a Python wrapper for this API, and saved the data in JSON format.
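
The last step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`fetch_cast`, `build_edges`, `save_graph`), the JSON layout, and the sample data are all assumptions made for the example. Only `fetch_cast` touches the TMDB API (through tmdbsimple's `Movies(...).credits()` call), so the rest runs without a key.

```python
import json
from collections import Counter
from itertools import combinations


def fetch_cast(movie_id, api_key):
    """Return the cast names of one movie via tmdbsimple (needs network access)."""
    import tmdbsimple as tmdb  # third-party: pip install tmdbsimple

    tmdb.API_KEY = api_key
    credits = tmdb.Movies(movie_id).credits()
    return [member["name"] for member in credits["cast"]]


def build_edges(casts):
    """Count how many movies each pair of actors shares.

    `casts` maps a movie title to its list of actor names (hypothetical layout).
    """
    edges = Counter()
    for names in casts.values():
        # set() drops duplicate names, sorted() gives each pair a canonical order
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


def save_graph(edges, path):
    """Write the edge list as JSON: [{"source", "target", "weight"}, ...]."""
    links = [
        {"source": a, "target": b, "weight": w}
        for (a, b), w in sorted(edges.items())
    ]
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(links, fh, ensure_ascii=False, indent=2)


# Placeholder data, so the sketch runs without an API key.
sample_casts = {
    "Movie A": ["Actor 1", "Actor 2", "Actor 3"],
    "Movie B": ["Actor 2", "Actor 3"],
}
edges = build_edges(sample_casts)
```

With real data, `casts` would be filled by calling `fetch_cast` for each movie of interest, and `save_graph(edges, "actors.json")` would store the result; the output file name is also just an example.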

.env for the TMDB API key

To re-run my actors network code, you have to:

  • get a TMDB API key by following the official documentation
    • create a TMDB account here
    • then go here and follow the steps
  • rename the file .env_sample to .env
  • open the .env file
  • replace the placeholder with your API key
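
For reference, here is a minimal sketch of how a script could read the key back from the .env file using only the standard library. The variable name `TMDB_API_KEY` and the helper names are assumptions for illustration; check .env_sample for the name the project actually uses. The project itself may load the file differently.

```python
import os


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (empty if missing)."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # skip blank lines, comments, and anything without an "="
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # drop surrounding whitespace and optional quotes around the value
            values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="TMDB_API_KEY"):
    """Return the key from the environment, else from the .env file."""
    key = os.environ.get(name) or load_env(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in the environment or in {path}")
    return key
```

Calling `get_api_key()` from the repository root would then return the key, or raise an error if the placeholder line is missing.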

Credit

  • to Cinetel/Anica, where you can find most of the data that I reworked
  • to SIAE, where you can find other data that I reworked
  • to Istat, where you can find the shapefiles of Italy
  • to TMDB API, for the movies/actors data