Choose a list of countries for update #60

Closed
spin0us opened this issue May 29, 2013 · 18 comments

Comments

@spin0us
Contributor

spin0us commented May 29, 2013

In the installation wiki it is specified that we can limit the update source range using some local settings like

 @define('CONST_Replication_Url', 'http://download.geofabrik.de/europe/france-updates');
 @define('CONST_Replication_MaxInterval', '259200');     // Process up to 3 days updates in a run
 @define('CONST_Replication_Update_Interval', '86400');  // How often upstream publishes diffs
 @define('CONST_Replication_Recheck_Interval', '900');   // How long to sleep if no update found yet

Is there any way to specify multiple countries or regions?
I only need a few countries in the update process.

@lonvia
Member

lonvia commented May 29, 2013

You should be able to apply several of the Geofabrik country updates, but you will have to write your own script for that; you cannot use --import-osmosis-all. For each country, your script should download the latest diff from Geofabrik (either use osmosis or fetch it manually once a day) and run ./utils/update.php --import-diff <difffile>. Once all countries are imported, run ./utils/update.php --index and you are done.

@spin0us
Contributor Author

spin0us commented May 29, 2013

Do you know where I can find the "OSM minutely replication sequence number"?
If I understand correctly, I need this sequence number to retrieve each country's diff file and import each one using the command you specified. But I don't see where I can get the current replication sequence number.

@lonvia
Member

lonvia commented May 29, 2013

You'll find the current one in the state.txt file in the root directory of each update directory, e.g. for France, it would be in http://download.geofabrik.de/europe/france-updates/state.txt.

Instead of getting it manually, you are probably better off using the osmosis mechanism for replication diffs. Once set up, it takes care of finding the correct file. Simply follow the instructions as described here, use one working directory per country, and don't forget to change the base directory in each configuration.txt. Browse the update directories on the Geofabrik server to find the right state file (look at the creation date).
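For the manual route mentioned above, the sequence number can be pulled out of a state.txt with a few lines of bash. This is only a sketch: it assumes the file uses the usual key=value layout with a `sequenceNumber=` line, the function name `read_sequence_number` is made up for illustration, and the sample values are invented.

```bash
#!/bin/bash
# Print the value of sequenceNumber from a state.txt read on stdin.
# Assumes a "key=value" layout with a "sequenceNumber=<n>" line.
read_sequence_number() {
    local line
    while IFS= read -r line; do
        case "$line" in
            sequenceNumber=*)
                printf '%s\n' "${line#sequenceNumber=}"
                return 0
                ;;
        esac
    done
    return 1  # no sequenceNumber line found
}

# Inline sample instead of a real download (values are made up)
sample_state='#Wed May 01 21:10:03 UTC 2013
sequenceNumber=58
timestamp=2013-05-01T20\:00\:00Z'

printf '%s\n' "$sample_state" | read_sequence_number
```

Against a live server, the same function could be fed from a download, for example `curl -s http://download.geofabrik.de/europe/france-updates/state.txt | read_sequence_number`.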

@twain47
Collaborator

twain47 commented May 29, 2013

Just look through the directory on the server for the state file that was created the day before the OSM file you created (to give a bit of overlap) and use that state file.

So if your import was from the 1st of May you'd look here:
http://download.geofabrik.de/europe/ireland-and-northern-ireland-updates/000/000/
and use 000000058 (or even 000000057 if you wanted to be careful)

It is probably safest to maintain one state per country / extract. The sequence numbers are not guaranteed to stay in sync across extracts.
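The directory listing above follows the replication layout in which the sequence number is zero-padded to nine digits and split into three groups of three. A small sketch of that mapping, with `sequence_to_path` as a hypothetical helper name:

```bash
#!/bin/bash
# Convert a replication sequence number into the AAA/BBB/CCC path
# used below an -updates directory.
sequence_to_path() {
    local padded
    # 10# forces base 10 so inputs like "058" are not read as octal
    padded=$(printf '%09d' "$((10#$1))")
    printf '%s/%s/%s\n' "${padded:0:3}" "${padded:3:3}" "${padded:6:3}"
}

sequence_to_path 58     # prints 000/000/058
sequence_to_path 2481   # prints 000/002/481
```

The state file for sequence 58 would then be expected at `000/000/058.state.txt` below the extract's -updates directory, with the matching diff next to it as `058.osc.gz`.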

@spin0us
Contributor Author

spin0us commented May 30, 2013

I've written this small bash script to do the job:

#!/bin/bash

### Country list
COUNTRIES="europe/isle-of-man europe/kosovo"
NOMINATIM="/var/Nominatim"

### Foreach country check if configuration exists (if not create one) and then import the diff
for COUNTRY in $COUNTRIES;
do
    DIR="$NOMINATIM/updates/$COUNTRY"
    FILE="$DIR/configuration.txt"
    if [ ! -f ${FILE} ];
    then
        /bin/mkdir -p ${DIR}
        /usr/bin/osmosis --rrii workingDirectory=${DIR}/.
        /bin/echo baseUrl=http://download.geofabrik.de/${COUNTRY}-updates > ${FILE}
        /bin/echo maxInterval = 0 >> ${FILE}
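        # Save state.txt into the working directory without changing directory
        /usr/bin/wget -O ${DIR}/state.txt http://download.geofabrik.de/${COUNTRY}-updates/state.txt
    fi
    FILENAME=${COUNTRY//[\/]/_}
    # Write the diff to an absolute path so the import loop below finds it
    /usr/bin/osmosis --rri workingDirectory=${DIR}/. --wxc ${NOMINATIM}/updates/${FILENAME}.osc.gz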
done

INDEX=0 # false

### Foreach diff files do the import
cd ${NOMINATIM}/updates
for OSC in *.osc.gz;
do
    # Skip the literal pattern when no diff file was produced
    [ -e "${OSC}" ] || continue
    ${NOMINATIM}/utils/update.php --import-diff ${NOMINATIM}/updates/${OSC} --no-npi
    INDEX=1
done

### Re-index if needed
if ((${INDEX}));
then
    ${NOMINATIM}/utils/update.php --index
fi

### Remove all diff files
rm -f ${NOMINATIM}/updates/*.osc.gz

Does this look OK?
I've tested it successfully, but I don't know whether using --no-npi is a good choice.

@lonvia
Member

lonvia commented Jun 6, 2013

It looks good to me. You can leave out the --no-npi but it won't do any harm either.

@lonvia
Member

lonvia commented Jun 6, 2013

Considering this solved. I've linked to this issue from the Wiki for future reference.

@lonvia lonvia closed this as completed Jun 6, 2013
@Fiyorin

Fiyorin commented Dec 9, 2013

FYI: getting OSC updates from Geofabrik for multiple countries can be problematic. Those updates are intended for one country only, not for mixing several together. You can do it, but it could result in, for example, streets being cut off.
A solution is to use the changesets for all of Europe or the planet. Your map will then slowly grow with the changes that are added (not all countries need to be loaded for this to work). If needed, you can trim the map afterwards (by bounding box) to particular countries to save disk space.

@norcis

norcis commented Feb 18, 2014

Could anybody write better instructions for newbies like me on how to set up Nominatim for multiple countries (not the whole planet)? What I understand from Cinch's comment is that using the script above is not a good idea and the data can become wrong.

@Fiyorin

Fiyorin commented Feb 20, 2014

Yes, from what I've gathered from various people (even from Geofabrik directly): if you're using European countries from Geofabrik, then apply the updates for all of Europe. That's how I did it. Applying the diffs takes longer and uses more disk space, but it's safer. If the hard drive gets too full from all the updates, you can safely "trim" the countries you're not interested in. I think osmosis can do the trimming, but I haven't tried it yet. Good luck :)

@lonvia
Member

lonvia commented Feb 20, 2014

Ways in Geofabrik's diffs are always complete and relations are member-complete (see here), which is sufficient for Nominatim. There are two problematic cases you might encounter: (1) Some ways from a boundary relation are missing. This won't cause Nominatim to behave badly; it just means that these relations will be missing in the DB as well. Note, however, that this is already a problem in the original extracts, so if that matters to you, you should have used the Europe extract to begin with. (2) A way or standalone node might get moved out of one of the extracts. It will then appear as a delete operation in one of the diffs, so the way/node disappears from the DB as well. I suspect this case is rare enough to be ignored, but that depends on your use case.

In summary: with @spin0us's script above, use exactly the diffs of the extracts you initially imported.

@mdeweerd

This is an old closed thread, but I think visitors can still benefit from an updated script, for which I created a gist:
https://gist.github.com/mdeweerd/9bc5f60f2d6733e907f3

It did not seem to work for me, so I made some updates. In the end, they suggest that some other issue might have been going on.

@mtmail
Collaborator

mtmail commented Sep 29, 2015

Thank you! Really helpful.

@anthologist

anthologist commented Jun 12, 2017

The script by @mdeweerd doesn't work. Specifically, the command
/usr/bin/osmosis --rri workingDirectory=${DIR}/. --wxc ${FILENAME}.osc.gz
does not seem to create any osc.gz file.
Is there a solution for this?

EDIT: got it, you need to change that line to:
/usr/bin/osmosis --rri workingDirectory=${DIR}/. --wxc ${DIR}/${FILENAME}.osc.gz

otherwise it won't find anything.

@RhinoDevel
Contributor

Thanks to @spin0us, @mdeweerd and everybody else!

Here is an updated version that works with Nominatim 3.2.0, uses HTTPS and has some other tiny improvements (less hard-coded stuff, etc.):

https://gist.github.com/RhinoDevel/8a35ebd2a08166f328eca01ab005c6de

@ItBedna

ItBedna commented Jan 4, 2020

Can it also be used to import multiple diffs of the same region and then index them all together? Or do I need to run update.php --index after every imported diff?

@mtmail
Collaborator

mtmail commented Jan 4, 2020

The original question was from 2013. Nominatim no longer uses osmosis, but osmium, so the instructions are likely outdated. http://nominatim.org/release-docs/latest/admin/Import-and-Update/#updates

Or I need to use update.php --index after every imported diff?

Importing marks places as unindexed (indexed_status > 0 in the placex table). The update.php --index run then looks for those and processes them. So if you managed to import multiple countries/regions, you only need to run update.php --index once. (Or rather, the second time it won't find any unindexed places.)

@ItBedna

ItBedna commented Jan 4, 2020

Thanks, but for example: I need to import the 480.osc.gz, 481.osc.gz and 482.osc.gz diffs from http://download.geofabrik.de/europe-updates/000/002/. Can I import all the files without indexing after each one, and index everything once at the end?
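Going by the explanation earlier in the thread that importing only marks places as unindexed, an "import several diffs, index once" run might look like the sketch below. It assumes the old update.php workflow discussed here (current Nominatim releases work differently), and `UPDATE_CMD` and `import_then_index` are hypothetical names. By default it only prints the commands instead of running them.

```bash
#!/bin/bash
# Import every diff given as an argument, then index once at the end.
# UPDATE_CMD is an assumption: point it at your update.php. The default
# "echo" only prints the commands so the sketch can be tried safely.
UPDATE_CMD="${UPDATE_CMD:-echo ./utils/update.php}"

import_then_index() {
    local diff imported=0
    for diff in "$@"; do
        # Diffs must be applied oldest first; the caller passes them in order
        $UPDATE_CMD --import-diff "$diff" || return 1
        imported=1
    done
    # A single indexing run picks up everything marked as unindexed
    if [ "$imported" -eq 1 ]; then
        $UPDATE_CMD --index
    fi
}

import_then_index 480.osc.gz 481.osc.gz 482.osc.gz
```

Setting `UPDATE_CMD=/var/Nominatim/utils/update.php` (the path used in the script above) would run the real commands instead of printing them.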
