Make sure bot categories are consistent in bots.yml #5727

Closed
etienne-martin opened this issue Mar 27, 2018 · 4 comments

Comments

@etienne-martin
Contributor

I've extracted a list of all the different bot categories from https://github.com/matomo-org/device-detector/blob/master/regexes/bots.yml:

  • Analytics SEO Crawler
  • Benchmark
  • Crawler
  • Feed Fetcher
  • Feed Parser
  • Read-it-later Service
  • Search bot
  • Search tools
  • Security Checker
  • Security search bot
  • Service Agent
  • Site Monitor
  • Social Media Agent
  • Validator
  • crawler

I think we should standardize these.

For instance, we have a category called Feed Fetcher and another one called Feed Parser. To me there's no difference between the two. Maybe I'm missing something.

There is also a category called Security Checker and another one called Security search bot. We could rename these to Security instead of having both categories.

I've noticed that the category on line 615 is not capitalized. This can cause problems for software that expects the value to be capitalized:
https://github.com/matomo-org/device-detector/blob/master/regexes/bots.yml#L615
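A quick way to spot this kind of mismatch is to group the category names by their lowercased form and report any group with more than one spelling. The following is only a rough sketch: `findCaseVariants` is a hypothetical helper name, and the sample list is a hand-picked subset of the categories above rather than the real file contents.

```php
<?php
// Hypothetical helper: group category names that differ only in letter case.
function findCaseVariants(array $categories): array
{
    $groups = [];
    foreach (array_unique($categories) as $category) {
        // Use the lowercased name as the grouping key.
        $groups[strtolower($category)][] = $category;
    }

    // Keep only the groups that have more than one spelling.
    return array_filter($groups, function (array $variants) {
        return count($variants) > 1;
    });
}

// Illustrative input only, taken from the list in this issue.
$sample = ['Crawler', 'Feed Fetcher', 'crawler', 'Search bot', 'Crawler'];
print_r(findCaseVariants($sample));
```

With the sample input, the only group reported is the one containing `Crawler` and `crawler`.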

I don't have time to create a PR right now but that's something I can definitely work on in the future.

Let me know what you think.

@Findus23
Member

Just for reference this script can be used for generating the list of categories:

<?php
// Lists the distinct bot categories found in regexes/bots.yml.
require_once "vendor/autoload.php";

$parser = new \DeviceDetector\Yaml\Symfony();
$yaml = $parser->parseFile("regexes/bots.yml");
$categories = [];
foreach ($yaml as $bot) {
    // Some entries may not define a category; skip those.
    if (!empty($bot["category"])) {
        $categories[] = $bot["category"];
    }
}
var_dump(array_unique($categories));

@sgiehl
Member

sgiehl commented Mar 27, 2018

@etienne-martin Sure. Feel free to create a PR for that as soon as you have some time. We could also add some simple tests to verify that new records use a predefined set of categories, or something similar.

@etienne-martin
Contributor Author

Adding tests for the categories would be a good thing. Even if we don't implement a list of predefined categories, we could at least make sure that they are capitalized correctly.
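To make the idea concrete, here is a minimal sketch of such a check. It is not the test that ended up in the repository: `checkCategories` is a hypothetical function name, the allow-list is a placeholder the maintainers would have to agree on, and the sample records are made up. It only assumes that each record is an array with optional `name` and `category` keys, as in bots.yml.

```php
<?php
// Hypothetical check: report records whose category is not capitalized
// or is not part of an agreed allow-list.
function checkCategories(array $bots, array $allowed): array
{
    $problems = [];
    foreach ($bots as $bot) {
        // Records without a category are skipped.
        if (empty($bot['category'])) {
            continue;
        }
        $name = isset($bot['name']) ? $bot['name'] : '(unnamed)';
        $category = $bot['category'];

        if ($category !== ucfirst($category)) {
            $problems[] = "$name: category '$category' is not capitalized";
        } elseif (!in_array($category, $allowed, true)) {
            $problems[] = "$name: unknown category '$category'";
        }
    }
    return $problems;
}

// Placeholder allow-list and made-up records, for illustration only.
$allowed = ['Crawler', 'Feed Fetcher'];
$bots = [
    ['name' => 'Example Bot A', 'category' => 'Crawler'],
    ['name' => 'Example Bot B', 'category' => 'crawler'],
    ['name' => 'Example Bot C', 'category' => 'Feed Parser'],
    ['name' => 'Example Bot D'],
];
print_r(checkCategories($bots, $allowed));
```

In a real test suite the returned list would be asserted to be empty, and the records would come from parsing bots.yml instead of an inline array.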

sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Mar 16, 2021
sgiehl added a commit that referenced this issue Mar 17, 2021
issue #5727

Co-authored-by: Stefan Giehl <stefan@matomo.org>
@sanchezzzhak
Collaborator

Reason: we forgot to close this issue when PR #6707 was added.

@mattab mattab changed the title Inconsistencies in bots.yml Make sure bot categories are consistent in bots.yml Jul 26, 2021