
Choose a list of countries for update #60

Closed
spin0us opened this Issue May 29, 2013 · 13 comments

7 participants

@spin0us
spin0us commented May 29, 2013

In the installation wiki it is specified that we can limit the update source range using some local settings like

 @define('CONST_Replication_Url', 'http://download.geofabrik.de/europe/france-updates');
 @define('CONST_Replication_MaxInterval', '259200');     // Process up to 3 days updates in a run
 @define('CONST_Replication_Update_Interval', '86400');  // How often upstream publishes diffs
 @define('CONST_Replication_Recheck_Interval', '900');   // How long to sleep if no update found yet

Is there any way to specify multiple countries or regions?
In fact, I only need some countries in the update process.

@lonvia
Collaborator
lonvia commented May 29, 2013

You should be able to apply several of the Geofabrik country updates, but you will have to write your own script for that; you cannot use --import-osmosis-all. For each country, your script should download the latest diff from Geofabrik (either use osmosis or get it manually once a day) and run ./utils/update.php --import-diff <difffile>. Once all countries are imported, run ./utils/update.php --index and you are done.

@spin0us
spin0us commented May 29, 2013

Do you know where I can find the "OSM minutely replication sequence number"?
If I understand correctly, I need this sequence number to retrieve each country's diff file and import each one using the command you specified. But I don't see where I can get the current replication sequence number.

@lonvia
Collaborator
lonvia commented May 29, 2013

You'll find the current one in the state.txt file in the root directory of each update directory, e.g. for France, it would be in http://download.geofabrik.de/europe/france-updates/state.txt.
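For the manual route, a minimal sketch of reading the sequence number out of a state.txt file. The function name `read_sequence_number` and the sample content are illustrative assumptions; the sketch only relies on the file containing a `sequenceNumber=<digits>` line.

```shell
#!/bin/bash
# Print the sequenceNumber value from a state.txt read on stdin.
# Assumes the file contains a line of the form "sequenceNumber=<digits>".
read_sequence_number() {
    sed -n 's/^sequenceNumber=\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Illustrative state.txt content (values are made up, not taken from the server).
SAMPLE_STATE='# original OSM minutely replication sequence number 123456
timestamp=2013-05-28T21\:03\:02Z
sequenceNumber=58'

SEQ=$(printf '%s\n' "$SAMPLE_STATE" | read_sequence_number)
echo "sequence number: ${SEQ}"

# Against the live server (network access assumed):
# wget -q -O - http://download.geofabrik.de/europe/france-updates/state.txt | read_sequence_number
```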

Instead of getting it manually, you are probably better off using the osmosis mechanism for replication diffs. Once set up, it will take care of finding the correct file. Simply follow the instructions as described here, use one working directory per country, and don't forget to change the base URL in each configuration.txt. Browse the update directories on the Geofabrik server to find the right state file (look at the creation date).

@twain47
Owner
twain47 commented May 29, 2013

Just look through the directory on the server for the state file that was created the day before the OSM file you created (to give a bit of overlap) and use that state file.

So if your import was from the 1st of May you'd look here:
http://download.geofabrik.de/europe/ireland-and-northern-ireland-updates/000/000/
and use 000000058 (or even 000000057 if you wanted to be careful)

It is probably safest to maintain one state per country / extract. The sequence numbers are not guaranteed to stay in sync between extracts.
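To make the directory layout above concrete, here is a small sketch that turns a sequence number into the path of its state file. It assumes the usual osmosis replication layout, where the number is zero-padded to nine digits and split into three groups of three; the helper name `sequence_to_path` is hypothetical.

```shell
#!/bin/bash
# Map a replication sequence number to its AAA/BBB/CCC path (assumed layout).
sequence_to_path() {
    local padded
    # 10# forces base 10 so that leading zeros are not read as octal.
    padded=$(printf '%09d' "$((10#$1))")
    echo "${padded:0:3}/${padded:3:3}/${padded:6:3}"
}

BASE_URL="http://download.geofabrik.de/europe/ireland-and-northern-ireland-updates"
# URL of the state file for sequence number 58 under the assumed layout.
echo "${BASE_URL}/$(sequence_to_path 58).state.txt"
```

The downloaded file would then be saved as state.txt in that extract's osmosis working directory, so the next osmosis run starts from that point.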

@spin0us
spin0us commented May 30, 2013

I've written this small bash script to do the job:

#!/bin/bash

### Country list
COUNTRIES="europe/isle-of-man europe/kosovo"
NOMINATIM="/var/Nominatim"

### For each country, check if a configuration exists (if not, create one) and then download the diff
for COUNTRY in $COUNTRIES;
do
    DIR="$NOMINATIM/updates/$COUNTRY"
    FILE="$DIR/configuration.txt"
    if [ ! -f "${FILE}" ];
    then
        /bin/mkdir -p "${DIR}"
        /usr/bin/osmosis --rrii workingDirectory="${DIR}/."
        /bin/echo "baseUrl=http://download.geofabrik.de/${COUNTRY}-updates" > "${FILE}"
        /bin/echo "maxInterval = 0" >> "${FILE}"
        # Starts from the latest state; use an older state file if the import predates it
        /usr/bin/wget -O "${DIR}/state.txt" "http://download.geofabrik.de/${COUNTRY}-updates/state.txt"
    fi
    FILENAME=${COUNTRY//[\/]/_}
    # Write every diff to the common updates directory, whatever the current directory is
    /usr/bin/osmosis --rri workingDirectory="${DIR}/." --wxc "${NOMINATIM}/updates/${FILENAME}.osc.gz"
done

INDEX=0 # false

### For each diff file, do the import
cd "${NOMINATIM}/updates"
for OSC in *.osc.gz;
do
    [ -e "${OSC}" ] || continue   # glob did not match: nothing to import
    "${NOMINATIM}/utils/update.php" --import-diff "${NOMINATIM}/updates/${OSC}" --no-npi
    INDEX=1
done

### Re-index if needed
if ((${INDEX}));
then
    "${NOMINATIM}/utils/update.php" --index
fi

### Remove all diff files
rm -f "${NOMINATIM}"/updates/*.osc.gz

Does it look OK?
I've tested it successfully, but I don't know whether using --no-npi is a good choice.

@lonvia
Collaborator
lonvia commented Jun 6, 2013

It looks good to me. You can leave out the --no-npi but it won't do any harm either.

@lonvia
Collaborator
lonvia commented Jun 6, 2013

Considering this solved. I've linked to this issue from the Wiki for future reference.

@lonvia lonvia closed this Jun 6, 2013
@cinch
cinch commented Dec 9, 2013

FYI: getting OSC updates from Geofabrik for multiple countries can be problematic. Those updates are intended for one country only, not for mixing several together. You can do it, but it could result in, for example, streets being cut off.
A solution is to use the changesets for the whole of Europe or the planet. This will make your maps slowly grow with the changes that have been added (it's not required to have all the countries loaded for this to work). If needed, you can then trim the map afterwards (by bounding box) to particular countries to save disk space.

@norcis
norcis commented Feb 18, 2014

Could anybody write better instructions on how to set up Nominatim for multiple countries (not the whole planet) for newbies like me? What I understand from cinch's comment is that using the script above is not good and the data can become wrong.

@cinch
cinch commented Feb 20, 2014

Yes, from what I've gathered from various people (even from Geofabrik directly): if you're using countries in Europe from Geofabrik, then apply the updates for the whole of Europe. That's how I did it. Applying the diffs takes longer and uses more disk space, but it's safer. If the hard drive gets too full from all the updates, you can safely "trim" the countries that you're not interested in. I think osmosis can do the trimming, but I haven't tried it yet. Good luck :)

@lonvia
Collaborator
lonvia commented Feb 20, 2014

Ways in Geofabrik's diffs are always complete and relations are member-complete (see here), which is sufficient for Nominatim. There are two problematic cases you might encounter: (1) Some ways from a boundary relation are missing. This won't cause Nominatim to behave badly; it just means that these relations will be missing in the DB as well. Note, however, that this is already a problem in the original extracts, so if that matters to you, you should have used the Europe extract to begin with. (2) A way or standalone node might get moved out of one of the extracts. That means it will appear as a delete operation in one of the diffs, so that the way/node disappears from the DB as well. I suspect that this case is rare enough that it can be ignored, but that depends on your use case.

In summary: use exactly the diffs of the extracts you initially imported, with @spin0us's script above.
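To get a feeling for how often case (2) could matter for a given extract, one option is to count the delete operations in a downloaded diff. This is only a rough sketch: it assumes the .osc file has one XML element per line, and the function name `count_deleted` and the sample diff are made up for illustration.

```shell
#!/bin/bash
# Count nodes, ways and relations inside <delete> blocks of an osmChange
# document read on stdin. Assumes one XML element per line.
count_deleted() {
    awk '
        /<delete[ >]/ { in_delete = 1 }
        /<\/delete>/  { in_delete = 0 }
        in_delete && /<node /     { nodes++ }
        in_delete && /<way /      { ways++ }
        in_delete && /<relation / { relations++ }
        END { printf "nodes=%d ways=%d relations=%d\n", nodes, ways, relations }
    '
}

# Tiny illustrative diff (not real data).
SAMPLE_OSC='<osmChange version="0.6">
  <modify>
    <node id="1" version="2" lat="0" lon="0"/>
  </modify>
  <delete>
    <node id="2" version="3" lat="0" lon="0"/>
    <way id="3" version="4">
      <nd ref="2"/>
    </way>
  </delete>
</osmChange>'

RESULT=$(printf '%s\n' "$SAMPLE_OSC" | count_deleted)
echo "$RESULT"

# On a real diff file (hypothetical file name):
# gzip -dc europe_kosovo.osc.gz | count_deleted
```

Note that the count includes ordinary deletions made by mappers, so it is an upper bound on objects that merely moved out of the extract.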

@mdeweerd

This is an old closed thread, but I think that visitors can still benefit from the updated script for which I created a gist:
https://gist.github.com/mdeweerd/9bc5f60f2d6733e907f3

The original did not seem to work for me, so I made some updates. In the end, they showed that some other issue might have been going on.

@mtmail
Collaborator
mtmail commented Sep 29, 2015

Thank you! Really helpful.
