Monitoring web page changes and keeping track of changes. #4
Comments
I'm quite new to Tosback2, so I can't comment here.
Maybe @Vinnl knows?
I wasn't involved with Tosback, but I looked into it a little. The README actually contains a pretty clear description (although the source code seems to include some old mess): https://github.com/tosdr/tosback2 So basically, there's a configuration file for every website to crawl, which defines the location of the terms of service and, optionally, the DOM element that contains them. The software isn't that complicated; collecting the pages to monitor appears to be a manual process. Is that what you want to know?
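To make the "one configuration per site, with a URL and an optional DOM element" idea concrete, here is a minimal sketch in Python using only the standard library. The `RULE` dictionary, its field names, and the `extract_text` helper are hypothetical and only illustrate the idea; they are not Tosback2's actual rule format or code. Fetching the page is left out, so the function takes the HTML as a string.

```python
from html.parser import HTMLParser

# Hypothetical rule: field names are illustrative, not Tosback2's real format.
RULE = {
    "site": "example.com",
    "url": "https://example.com/terms",
    "element_id": "tos",  # optional; None means "keep the whole page"
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementText(HTMLParser):
    """Collect the text inside the element with the given id (or all text)."""

    def __init__(self, element_id=None):
        super().__init__()
        self.element_id = element_id
        self.capture_all = element_id is None
        self.depth = 0  # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.capture_all or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1  # nested tag inside the target element
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1  # entered the target element

    def handle_endtag(self, tag):
        if self.capture_all or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if (self.capture_all or self.depth > 0) and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, rule):
    """Return the text selected by the rule from an HTML string."""
    parser = ElementText(rule.get("element_id"))
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Restricting the crawl to one element keeps navigation menus and footers out of the stored text, so unrelated page changes do not show up as changes to the terms.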
Yes, that is exactly what I needed. The fog of confusion has lifted and the README now makes sense! So on the server that's running this Ruby script, the output goes to a checked-out copy of the git repo and is regularly committed and pushed to GitHub? Then the web frontend just piggybacks on the GitHub diff view. I see that @JimmStout wrote most of the code! Can I ask whether there has been any discussion of these design choices? Were there bigger ideas about where it could go? Are there other projects out there that do similar things and could be reused? I do like how small the code base is, but I'm curious about the context to help me decide whether I should adopt it and work on it or expand my search.
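If the workflow is as guessed in the comment above (crawl output written into a git checkout, then committed and pushed so that the hosting site's diff view shows what changed), it could be sketched roughly as follows. This is only an assumption about the design, written in Python rather than the project's Ruby; the `save_snapshot` and `commit_and_push` names and the `<site>/terms.txt` layout are hypothetical.

```python
import subprocess
from pathlib import Path


def save_snapshot(repo_dir, site, text):
    """Write crawled text into the checkout; return True if the file changed."""
    path = Path(repo_dir) / site / "terms.txt"  # hypothetical file layout
    if path.exists() and path.read_text(encoding="utf-8") == text:
        return False  # nothing new, so nothing to commit
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return True


def commit_and_push(repo_dir, message):
    """Stage, commit, and push the checkout.

    Assumes git is installed, repo_dir is a git checkout with a configured
    remote, and there is at least one change to commit.
    """
    for args in (["add", "-A"], ["commit", "-m", message], ["push"]):
        subprocess.run(["git", "-C", str(repo_dir), *args], check=True)
```

A scheduled job would call `save_snapshot` for every crawled page and call `commit_and_push` only when at least one call returned `True`, so each commit corresponds to a real change in some monitored page.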
Seems like there's a workflow aspect (crawl_reviewed) that's not super clear. Maybe @hugoroy can shed some light from the user perspective?
Sorry, this is a bit too old for me to remember. The last time I looked into the technical aspects was in 2013 (right before Snowden!). You may find some useful information on http://jimmstout.com/ (Jimm's blog).
Quite a number of pages other than terms of service would need to be monitored for changes. I'm wondering how Tosback2 does it. @pde @pierreozoux @hugoroy
Examples of such pages would be: