Skip to content

Clojure(Script) library to identify crawler and bot user agent strings

License

Notifications You must be signed in to change notification settings

Olical/crawlers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Crawlers Clojars Project

Clojure(Script) library to identify crawler and bot user agent strings. Relies on monperrus/crawler-user-agents for the regular expressions.

Usage

(require '[crawlers.detector :as crawlers])

;; This isn't a crawler.
(time (crawlers/crawler? "Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"))
;; "Elapsed time: 0.214012 msecs"
;; => nil

;; This is!
(time (crawlers/crawler? "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Safari/537.36"))
;; "Elapsed time: 0.056963 msecs"
;; => "Googlebot/"

Updating

The list of expressions is fetched from crawler-user-agents.json so it needs to be occasionally synchronised.

You can submit an update to the regular expression list by executing clj -Afetch and pasting the resulting vector in src/crawlers/detector.cljc.

The last update was performed at Tue 23 Oct 11:46:14 BST 2018. Pull requests to update this are welcome.

Unlicenced

Find the full unlicense in the UNLICENSE file, but here's a snippet.

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

Do what you want. Learn as much as you can. Unlicense more software.

About

Clojure(Script) library to identify crawler and bot user agent strings

Resources

License

Stars

Watchers

Forks

Packages

No packages published