Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Detached] Make node-geolite2 a redistribution of MaxMind's databases #19

Closed
wants to merge 17 commits into from

Conversation

@GitSquared
Copy link
Contributor

GitSquared commented Jan 3, 2020

Following the discussion in #17.

Roadmap before mergeable:

  • Track databases on GitHub
  • Rewrite postinstall script
  • Write new lib License
  • Setup automatic updates, "backend-side" (see this comment) either with Travis or GitHub Actions, and a maintainer's MaxMind license
  • Setup automatic updates, "client-side" - make the library keep the downloaded databases up to date (I added checksum validation which could be useful for this)
  • Send an email to MaxMind's legal to make sure they're OK with what we're doing (should be, but won't hurt to ask)
  • Update README
  • Write new tests
  • Change the download URL to point to this repo instead of my fork Detach fork and publish under new name to NPM

I'll keep working on this but I wanted to share my progress so far and get @runk's suggestions if any.


Close #17

package.json Outdated Show resolved Hide resolved
GitSquared added 4 commits Jan 4, 2020
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 4, 2020

So I implemented a self-updating mechanism to automatically update the local databases copies from the git repo. It uses SHA-384 checksums to determine whether a newer version is available. It's capable of keeping database copies up to date transparently, even when the library is used in a long-running script like a server or something, which should help ensure that nobody keeps using old versions of the databases past the allowed upgrade period of 30 days (see Do Not Sell requests and relevant sections in MaxMind's EULA).

I had to make a major API change however: the maxmind module is now incorporated in geolite2.

Old API:

var geolite2 = require('geolite2');
var maxmind = require('maxmind');

var lookup = maxmind.openSync(geolite2.paths.city); // or geolite2.paths.country or geolite2.paths.asn
var city = lookup.get('66.6.44.4');

Proposed API (this PR):

var geolite2 = require('geolite2');

var lookup = geolite2.openSync('GeoLite2-City'); // or -Country or -ASN
// an asynchronous geolite2.open() method is also available and returns a Promise
var city = lookup.get('66.6.44.4');
Implementation details geolite2 returns a Proxy wrapper around a maxmind Reader instance. When a property of the reader is accessed, a background routine starts (either once every 48 hours or the first time the reader is accessed) to fetch the latest databases checksums from the git mirror.

If any of the checksums differ from the local copy, fresh copies of the databases are downloaded, and their checksums matched again. Normally they should now be correct, so new maxmind Reader instances are spun up, loading up in memory the updated data, and they replace the old ones.

If the checksums don't match after a fresh download, or for any reason anything fucks up, a warning is printed to stdout and the whole self-update shebang will run again with the usual trigger (48 hours or program restart).

CI is broken again because the tests are now irrelevant and need to be rewritten.


Setup automatic updates, "backend-side" (see this comment) either with Travis or GitHub Actions, and a maintainer's MaxMind license

@runk I'm going to need your guidance/help for this part; as this involves maintainer secrets and CI configuration that I cannot change within the scope of a PR.

Once this is done we just have to verify the redistribution terms with MaxMind and, assuming all parties involved are OK, this will be mergeable.

Copy link
Owner

runk left a comment

Thanks for the PR! Lots of new code!!

Not sure I'm sold on tightly coupling this module with specific lookup library. Can you see alternative options?

index.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
scripts/download-helper.js Outdated Show resolved Hide resolved
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 6, 2020

Thanks for taking the time to review this - lots of new code indeed.
Regarding the CI/backend stuff, I'm assuming we'd rather go with Travis since it's already set up - is that okay?

I'll write a bash script to be run as a cron job. Need to look into how to push from CI.

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 8, 2020

Update script written. Ci-independent for now, I'll add that part later on.

❯ env MAXMIND_LICENSE_KEY=[key] ./update-databases

Downloading checksum for latest version of GeoLite2-ASN...
Downloading GeoLite2-ASN...
Verifying checksum...
Extracting database...
Calculating database checksum for redistribution...
Database GeoLite2-ASN successfully updated

Downloading checksum for latest version of GeoLite2-Country...
Downloading GeoLite2-Country...
Verifying checksum...
Extracting database...
Calculating database checksum for redistribution...
Database GeoLite2-Country successfully updated

Downloading checksum for latest version of GeoLite2-City...
Downloading GeoLite2-City...
Verifying checksum...
Extracting database...
Calculating database checksum for redistribution...
Database GeoLite2-City successfully updated

All databases processed
@runk

This comment has been minimized.

Copy link
Owner

runk commented Jan 8, 2020

FYI there's MAXMIND_LICENSE_KEY env var already setup on travis ci. Probably you already figured it :)

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 8, 2020

Indeed, which is why I reused that env var in the bash script.
By the way, it checks current versions against the remote checksums before pulling updates, to ensure your license key isn't over-used and flagged as spam or whatever.

In other news, I have written a draft mail for MM's legal, if you're OK with that I'd like to send it to you first so you can review it.

@runk

This comment has been minimized.

Copy link
Owner

runk commented Jan 9, 2020

Sure 👍 My email is on the profile

GitSquared added 3 commits Jan 9, 2020
…artial updates
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 9, 2020

The Travis cron job script is up.
@runk Once this merges, you'll need to setup a cron job on master and a $GITHUB_PUSH_TOKEN secret env var on Travis with a GitHub personal access token.

GitSquared added 4 commits Jan 9, 2020
Update README
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 9, 2020

Readme documentation and tests have been written.
Screenshot_20200109_163459

@d1str0

This comment has been minimized.

Copy link

d1str0 commented Jan 9, 2020

I'm the maintainer of an open source project that also used to directly download these databases on install (directly from Maxmind, not via this repo). I would love to hear what the response from the legal team is as this is something my project will also have to deal with.

Cheers

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 9, 2020

@d1str0 We have not contacted them yet. Runk and I are working in opposite timezones so things are going a bit slow.
You can subscribe to this thread for further updates.

@GitSquared GitSquared changed the title [WIP] Make node-geolite2 a redistribution of MaxMind's databases [Need approval] Make node-geolite2 a redistribution of MaxMind's databases Jan 9, 2020
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 11, 2020

@runk Have you received my email?

@runk

This comment has been minimized.

Copy link
Owner

runk commented Jan 13, 2020

Hey @GitSquared, it was in trash - just found it.

After reading and looking at all the code I still feel a little uncomfortable redistributing databases like that. I really appreciate all the work you've done, however I don't think I'm feeling like accepting this change in the current repository. Whole legal deal makes me feel too uncomfortable.

That said, would you mind cloning/copying this repo and doing exactly what you want? I'm more than happy to add a reference to your repo from this one - no questions at all.

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 13, 2020

it was in trash - just found it.

GMail hates personal mailservers. I'd appreciate if you could report it as "not spam". Thanks.

I understand your decision, thanks for reviewing this PR either way.
I will contact MaxMind myself and detach my fork is they give me a positive answer - I'll update this thread.

I know you're not a stranger to them - they reuse your reader lib in their official APIs - so I had wished that with you as a maintainer they would be more likely to try and find a way for the license to work. Guess it's time to test my street cred.

Edit: Revised email sent to MM. (ping @d1str0).

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 20, 2020

I got an answer from MaxMind.

They provided a helpful link to a page deep inside their support Q&A along with the following statement:

If you provide attribution you can redistribute the GeoLite2 Databases without a commercial redistribution license.

If GeoLite2 will be distributed through your product, you need to ensure that you have communicated or provided a mechanism for your customers to use the most current version of the database, that databases are required to be destroyed 30 days after MaxMind releases a new version, and that do not sell notifications flow down to those who you’ve shared the database with.

They kept themselves in the clear by adding that they "are not in a position to give legal advice" - as far as I'm involved this is just lawyer speak to avoid any troubles.

So tl;dr they cool with it, especially since it self-updates.

I'll detach & publish my fork in the next few days.

@d1str0

This comment has been minimized.

Copy link

d1str0 commented Jan 20, 2020

@GitSquared even though I don't use Node or these libs in my project, I highly appreciate all the work you've put into this issue, both from a coding implementation and also handling communication with Maxmind. Kudos, mate.

@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 20, 2020

@d1stro Cheers. Sometimes I like to get shit done.

@GitSquared GitSquared changed the title [Need approval] Make node-geolite2 a redistribution of MaxMind's databases [Detached] Make node-geolite2 a redistribution of MaxMind's databases Jan 22, 2020
@GitSquared GitSquared closed this Jan 22, 2020
@GitSquared

This comment has been minimized.

Copy link
Contributor Author

GitSquared commented Jan 22, 2020

Package is live as geolite2-redist: npm | git

@runk feel free to mention it in this one's README. You can also prob close #17

@AnandChowdhary

This comment has been minimized.

Copy link

AnandChowdhary commented Jan 23, 2020

Thanks for all the hard work, @GitSquared and @runk!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.