Add source www.filmtipset.se #97

Closed
Row opened this issue Oct 21, 2019 · 2 comments

Comments

@Row
Contributor

Row commented Oct 21, 2019

Add filmtipset.se as a source, and perhaps later as a destination.

Filmtipset.se has been one of Sweden's largest (and greatest) movie rating communities. But the site was recently relaunched by a new owner who did not know or care what Filmtipset was about.

Here is some information about the site: https://sv.wikipedia.org/wiki/Filmtipset

Quote from Wikipedia (lazy Google Translate):

In November 2009, Filmtipset had over 87,600 registered users and the database contained over 69,900 films and 18.7 million ratings. In July 2011, Filmtipset had 103,400 users, 81,600 films and 23 million ratings. [18] In January 2017, the database contained over 120,000 registered users, 112,775 films and over 29 million ratings. In September 2019, Filmtipset had 122,000 registered users, 123,500 films and 29.8 million ratings. [18]

@StegSchreck StegSchreck added this to Backlog in RatS via automation Oct 21, 2019
@Row
Contributor Author

Row commented Oct 21, 2019

Some more information: Filmtipset.se has no API at the moment.
I think the data needed for an export is public.
The ratings and rating dates for each user can be found at the URL below, where p is the pagination offset:
https://www.filmtipset.se/betyg/ExampleUserName?p=0
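
A minimal sketch of how those paginated URLs could be generated, using only the Python standard library. The function names are hypothetical, and the `step` parameter is an assumption: it is not known here whether `p` advances by one per page or by the number of ratings per page.

```python
from urllib.parse import quote, urlencode

# Base of the per-user ratings list, taken from the example URL above.
BASE_URL = "https://www.filmtipset.se/betyg/"


def ratings_page_url(username: str, offset: int = 0) -> str:
    """Build the ratings-list URL for one user at a given pagination offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # Percent-encode the user name so unusual characters stay inside the path.
    return f"{BASE_URL}{quote(username, safe='')}?{urlencode({'p': offset})}"


def ratings_page_urls(username: str, pages: int, step: int = 1):
    """Yield URLs for the first `pages` pages.

    `step` is an assumption about how `p` advances between pages.
    """
    for page in range(pages):
        yield ratings_page_url(username, page * step)
```

A parser would request each URL in turn and stop once a page comes back without any ratings.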

IMDB-id might be present at each movie page e.g.:
https://www.filmtipset.se/film/the-beach-bum
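
If the movie page does link to IMDb, the ID could be pulled out of the page HTML roughly like this. This is a sketch under an assumption: that the page contains a link of the form `imdb.com/title/tt…`, which has not been verified against the real markup. The function name is hypothetical.

```python
import re
from typing import Optional

# IMDb title IDs are "tt" followed by at least seven digits.
# Assumption: the movie page links to imdb.com/title/<id>.
IMDB_ID_PATTERN = re.compile(r"imdb\.com/title/(tt\d{7,})")


def extract_imdb_id(html: str) -> Optional[str]:
    """Return the first IMDb title ID linked in the HTML, or None if absent."""
    match = IMDB_ID_PATTERN.search(html)
    return match.group(1) if match else None
```

Returning `None` rather than raising lets a parser fall back to matching by title and year when a page has no IMDb link.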

I might be able to help out with the request, but I need better information on how to contribute.
How do I set up the development environment, preferably via Docker? Are there any good commits or code to look at?

@StegSchreck
Owner

The recommended dev environment would be a virtual env.
There is a Dockerfile present in the project, though, which you can also use for local runs.

I would recommend checking out the other parsers and inserters to get a general idea. The most recently implemented ones were for RottenTomatoes (see PR #94).

When a file download is offered, it is preferred because it speeds up parsing. Another option to look at is making JavaScript calls from the Selenium-controlled browser, which avoids loading more data than actually needed. The remaining option is web scraping with BeautifulSoup.

Row added a commit to Row/RatS that referenced this issue Oct 13, 2020
Row added a commit to Row/RatS that referenced this issue Oct 13, 2020
Row added a commit to Row/RatS that referenced this issue Oct 13, 2020
Row added a commit to Row/RatS that referenced this issue Oct 14, 2020
Row added a commit to Row/RatS that referenced this issue Oct 16, 2020
RatS automation moved this from Backlog to Done Oct 17, 2020
StegSchreck added a commit that referenced this issue Oct 17, 2020
@StegSchreck StegSchreck added this to the v0.13 milestone Oct 17, 2020
StegSchreck added a commit that referenced this issue Oct 17, 2020