Only allow one instance of the crawler to run at a time. #28

supersam654 · 2016-02-19T17:47:01Z

On the production server, we attempt to run the crawler every 2 minutes. Sometimes (usually), the crawler can't finish in that time, so a new run starts while the previous one is still going. To deal with this, we should add a lock file: if a newly started crawler detects the file, it exits immediately.

Specific Requirements:

- When the crawler is started, check if the lock file exists. If it does, print a message saying it exists and exit.
  - The message should also explain that the crawler didn't shut down properly last time.
  - The message should also give a one-line command to delete the existing lock file (if you want to be super fancy, having the crawler take an extra argument to ignore the lock file would be cool too).
- If it doesn't exist, make the lock file and then run the crawler normally.
- When the crawler is done, delete the lock file.
- The lock file can be placed in the root of the project folder or in the system tmp folder.
  - Putting it in the project folder is easy for cross-platform support, but then you need to add it to .gitignore.
  - Putting it in the system temp folder is probably cleaner, but make sure it works on Windows.
- If you want, not using a lock file at all in development mode (look for an environment variable that would only be set in production) works fine too.
- Call the lock file something sane (like qdoc.lock).
- The file can be empty or have some basic information like the time it was created.
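
For reference, here is a rough sketch of how the requirements above could fit together. It is written in Python purely for illustration, since this issue doesn't tie the approach to a language. The environment variable `QDOC_PRODUCTION`, the `--ignore-lock` flag, and all function names are made up for the example; only the `qdoc.lock` file name comes from the list above. It puts the lock file in the system temp folder and creates it with an exclusive open, so the "check if it exists" and "make the lock file" steps can't race each other.

```python
import argparse
import os
import tempfile
import time

# "qdoc.lock" is the name suggested in this issue; the location and the
# environment variable name below are assumptions for this sketch.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "qdoc.lock")
PRODUCTION_ENV_VAR = "QDOC_PRODUCTION"  # hypothetical, set only in production


def release_lock(path=LOCK_PATH):
    """Delete the lock file, tolerating the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def acquire_lock(path=LOCK_PATH, ignore_existing=False):
    """Create the lock file; return False if it already exists."""
    if ignore_existing:
        release_lock(path)
    try:
        # O_EXCL makes creation fail if the file exists, so checking and
        # creating happen in a single step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        # Basic information only: the time the lock was created.
        lock_file.write(time.strftime("%Y-%m-%dT%H:%M:%S") + "\n")
    return True


def lock_exists_message(path=LOCK_PATH):
    """Build the message printed when the lock file is already present."""
    command = "del" if os.name == "nt" else "rm"
    return (
        "Lock file {0} exists. Another crawl is still running, or the "
        "crawler didn't shut down properly last time.\n"
        'To delete the lock file, run: {1} "{0}"'.format(path, command)
    )


def run_crawler():
    """Placeholder standing in for the real crawl."""
    print("crawling...")


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--ignore-lock", action="store_true",
                        help="delete an existing lock file and run anyway")
    args = parser.parse_args(argv)

    # Development mode: skip the lock file entirely.
    if not os.environ.get(PRODUCTION_ENV_VAR):
        run_crawler()
        return 0

    if not acquire_lock(ignore_existing=args.ignore_lock):
        print(lock_exists_message())
        return 1
    try:
        run_crawler()
    finally:
        # Runs even if the crawl raises, so a failed crawl doesn't leave
        # a stale lock behind.
        release_lock()
    return 0
```

The entry point would call `main()` and use its return value as the exit code. The `finally` block doesn't cover a hard kill (for example `kill -9` or a power loss), which is why the message still needs to explain how to delete a stale lock by hand.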
