Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time


eurlex.js is a command line utility to retrieve documents (specifically: regulation drafts) in all supported languages from the EUR-Lex website and convert them into JSON. It is made with node.js and can be installed locally via npm.


eurlex.js can be installed using npm:

npm install -g eurlex

Of course you must have node node.js with npm installed.

eurlex.js works fine with Linux, *BSD and Darwin, but never was tested with Win32.


Once installed you can use eurlex on the command line:

eurlex [options] <EUR-Lex URI>

You get a brief description of all the options with

eurlex --help

If you are curious what it looks like to get and convert something, try:

eurlex -vu -l de,en,fr COM:2012:0011:FIN -o eurlex-com-2012-0011-fin.json


Since the HTML otuput of Eurlex is pretty far from being machine readable, eurlex.js applies a lot of magic to read it anyway. The magic can be fine tuned with setting in a file called profile.json. Here is a stripped and commented version of profile.json:

	"lang": ["en","de","..."],           // array of avalable languages
	"expressions": {                     // regular expressions
		"lang": "...",                   // to match the language of the document 
		"title": "..."                   // to match the title of the document
	"delimiters": {                      // delimiters (they are all regex)
		"en": {                          // for this language 
			"recitals": ["...","..."],   // start and end of recitals
			"articles": ["...","..."],   // start and end of articles
			"chapter": "^CHAPTER ",      // string to match a chapter
			"section": "^SECTION ",      // string to match a section
			"article": "^Article ",      // string to match an article
			"fixes": [                   // before a line is parsed
				["...","..."],           // .replace(/first/, "second")
				["...","..."]            // as many as you need
		"lv": {
			"recitals": ["...","..."],
			"articles": ["...","..."],
			"chapter": [                 // if this is an array
				"^([XVI]+) NODAĻA",      // if matches: chapter
				"^([XVI]+) NODAĻA$",     // if matches: text missing
				"^([XVI]+) NODAĻA (.*)$" // $1 is the literal, $2 is the text
			"section": [                 // same here...
				"^([0-9]+)\\. IEDAĻA", 
				"^([0-9]+)\\. IEDAĻA$", 
				"^([0-9]+)\\. IEDAĻA (.*)$"
			"article": [                 // note! for article[3] 
				"^([0-9]+)\\. pants",    // $1 is the literal, __$3__ is the text
				"^([0-9]+)\\. pants$", 
				"^([0-9]+)(\\.) pants (.*)$"
			"fixes": []                  // fixes indeed can be empty

Limitations & Known issues

  • In Magyar, paragraphs and points partly use the same literal enclosures, which leads to paragraphs will be interpreted as headless points. You should be safe using --unify with another language as first parameter.
  • The translations for Malti are formatted pretty crappy and have redundant fragments. You have to hardly rely on the fixes in your profile.json


eurlex.js is licensed under EUPL


Retrieve documents from EUR-Lex and convert them to usable data






No releases published


No packages published