Add possibility to monitor a list of URLs instead of a single URL #15

Closed
coveritytest opened this issue Jan 25, 2021 · 4 comments

@coveritytest
Contributor

It would be nice if you could add a file with a list of URLs to be checked, instead of a single URL.

@muety
Owner

muety commented Jan 25, 2021

Good idea! How would you want to proceed with the tolerance and XPath parameters? Would you suggest configuring them separately for each URL (see below) or globally for all? What would be appropriate for your use case?

# (url, tolerance, xpath)
https://url1.com, 75, /body
https://url2.com, 0, //div[name="article-container"]
...

@coveritytest
Contributor Author

coveritytest commented Jan 25, 2021

Separately for each URL would be nice, of course :) What do you think of JSON as input?

{
  "url_list": [
    {
      "url": "https://url1.com",
      "tolerance": 75,
      "xpath": "/body"
    },
    {
      "url": "https://url2.com",
      "tolerance": 0,
      "xpath": "//div[name='article-container']"
    }
  ]
}

@muety
Owner

muety commented Jan 25, 2021

A JSON-formatted scrape config like the one you suggested seems to be a good option.

I'm thinking about keeping the logic for performing multiple scrapes outside of the actual watcher and providing a Bash script instead. It would read the JSON config, iterate over the array of URLs and parameters, and call watcher.py for each entry. Doing so wouldn't add complexity to the actual code and would keep control logic separated from business logic. The only drawback is that Windows users wouldn't be able to use that functionality unless we also provided a Batch or PowerShell script. A sketch of such a wrapper is below.
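
A minimal sketch of what that wrapper could look like, assuming jq is available and assuming, for illustration only, that watcher.py would take the URL, tolerance, and XPath as positional arguments (placeholders, not the actual CLI):

#!/usr/bin/env bash
# Hypothetical wrapper: reads a JSON scrape config and calls watcher.py
# once per entry. Requires jq; the watcher.py arguments shown here are
# assumptions for the sketch, not the real interface.
set -euo pipefail

CONFIG="${1:-config.json}"

# Emit one tab-separated line per entry: url, tolerance, xpath
jq -r '.url_list[] | [.url, (.tolerance|tostring), .xpath] | @tsv' "$CONFIG" \
| while IFS=$'\t' read -r url tolerance xpath; do
    python3 watcher.py "$url" "$tolerance" "$xpath"
done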

@coveritytest
Contributor Author

coveritytest commented Jan 25, 2021

Yeah, that's a good point; it might be better to do it this way. Maybe CSV would then be easier to parse in a Bash script. On the other hand, with tools like jq, JSON shouldn't be a problem either.
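
For comparison, a minimal sketch of the CSV variant using only Bash built-ins, with the same placeholder watcher.py arguments as above (note this naive split breaks if an XPath contains a comma):

#!/usr/bin/env bash
# Hypothetical CSV variant: one "url, tolerance, xpath" entry per line;
# lines starting with '#' are treated as comments. watcher.py arguments
# are placeholders for the sketch.
set -euo pipefail

while IFS=, read -r url tolerance xpath; do
    [[ "$url" =~ ^# || -z "$url" ]] && continue   # skip comments/blanks
    python3 watcher.py "$url" "${tolerance// /}" "${xpath# }"
done < urls.csv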

muety closed this as completed in 0d00ba1 Jan 29, 2021