Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when replacing yesterday's geodata (Extract.writer.error [Error: ENOENT: no such file or directory, lstat '/tmp/geonames/cities/cities1000.txt']) #39

Closed
tkafka opened this issue Mar 19, 2021 · 5 comments · Fixed by #40

Comments

@tkafka
Copy link
Contributor

tkafka commented Mar 19, 2021

Reproduction steps:

  1. Run local-reverse-geocoder, let it create files with dates in name (eg. cities/cities1000_2021-03-19.txt)
  2. Wait another day (or rename file to older date, eg. cities/cities1000_2021-03-19.txtcities/cities1000_2021-03-15.txt)
  3. Start local-reverse-geocoder again.

What I expected:

Geocoder updates its files.

What actually happens

Geocoder crashes with Extract.writer.error [Error: ENOENT: no such file or directory, lstat '/tmp/geonames/cities/cities1000.txt'].

The cause is this:

  • we call unzip.Extract
  • Extract extracts files from downloaded zip into a folder and emits close
  • We rename and unlink
  • Extract uses fstream module, and it calls stat on extracted file after calling close - see code!
  • Calling stat on now nonexistent file causes crash (error is emitted from fstream, and not handled by anyone).

Fix ideas:

  1. Fix the underlying stream libs (fstream shouldn't touch the extracted file after it passes close event)
  2. Ignore error event from unzip.Extract
extractStream
          .on('error', function (err) {
            if (err.code === 'ENOENT') {
              // ignore - fstream writer runs stat/lstat after extract is finished (and .close sent), and it crashes, because we have already used and moved/deleted it's file
            } else {
              throw err
            }
          })
  1. Use unzip to extract just a single file we need, in memory, without writing to disk
  2. Keep the extracted file (eg. cities1000.txt), so that we allow fstream to touch it however it wants after the extraction.

Beware:

All _get*data methods have this problem, it needs to be fixed everywhere.

@tkafka
Copy link
Contributor Author

tkafka commented Mar 19, 2021

It is very curious that no one has reported this before, as it happens every time you don't clean up the data and restart local-reverse-geocoder. I may have discovered it by moving dumpDirectory to /tmp/geonames (which I believe should be a default place, the geo database shouldn't be just dumped inside node_modules).

@tomayac
Copy link
Owner

tomayac commented Mar 19, 2021

I think what you're asking for is making GEONAMES_DUMP user-configurable. I'm happy to merge a PR that adds this.

@tkafka
Copy link
Contributor Author

tkafka commented Mar 19, 2021

@tomayac Hi Thomas, that was already configurable - see here.

I dived into the error with streams, but I think that the real problem is that one of node-unzip-2, fstream, or their interactions is wrongly written. However, there was no need to actually store the unzipped files on a disk, so I forked local-reverse-geocoder to https://github.com/tkafka/local-reverse-geocoder and rewrote it to use unzip-stream to extract the wanted file from stream to disk.

See tkafka@6e96500

@tomayac
Copy link
Owner

tomayac commented Mar 22, 2021

@tomayac Hi Thomas, that was already configurable - see here.

Oh, sorry. It's been a while that I touched this code.

I dived into the error with streams, but I think that the real problem is that one of node-unzip-2, fstream, or their interactions is wrongly written. However, there was no need to actually store the unzipped files on a disk, so I forked local-reverse-geocoder to https://github.com/tkafka/local-reverse-geocoder and rewrote it to use unzip-stream to extract the wanted file from stream to disk.

Thanks for working on this, but…

See tkafka@6e96500

…I wonder why you went for forking this. I am and always was happy to merge PRs. I'd be happy to add you to the list of contributors.

@tkafka
Copy link
Contributor Author

tkafka commented Mar 23, 2021

@tomayac Please don't take it personally, I just needed it fixed quickly. I just created a pull request #40, and I will be happy to delete local-reverse-geocoder-2 from NPM after this fix is published.

I have been running it on my server for past few days, and finally the geocoder reloads smoothly without crash, even when server is rebooted a day after previous start.

Feel free to add me on a contributor list - thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants