Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add option to valhalla_build_extract to use existing tar file for lookup/copy #4255

Merged
merged 15 commits into from
Aug 29, 2023

Conversation

nilsnolde
Copy link
Member

When running an instance of Valhalla, what one usually has is the tar file, and often no plain tile directory. Then valhalla_build_extract can't be used, which is a pity. At least for the public instance, I regularly have that "problem" and I don't want to un-tar those 80 GB before I can rip a few out.

This PR adds support to build extracts from tar files.

@@ -53,6 +53,38 @@ class TileHeader(ctypes.Structure):
]


class TileResolver:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a wrapper so we don't have to care about whether we're pulling tiles from a dir or tar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like this construct. Usually I'd do a base class and 2 inherited ones, but for brevity's sake I did just this.

Comment on lines 108 to 113
parser.add_argument(
"-O",
"--overwrite",
help="Overwrites an output tar file",
action="store_true"
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has nothing to do with the actual PR's changes, it's just smth we should do as small protection

Comment on lines 101 to 107
parser.add_argument(
"-e",
"--extract-tar",
help="If specified, will build an extract from an existing tar file at 'mjolnir.tile_extract' and save it to this specified path.",
type=str,
default=""
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if someone specifies e.g. -e new_tar.tar, we'd write the result to ./new_tar.tar instead of what specified in the config. the tar specified in the config is instead taken as input.

tiles_dir: Path = Path(config["mjolnir"].get("tile_dir", '/dev/null'))
if not tiles_dir.is_dir():
# optionally use the tile extract to find the tiles
if args.extract_tar and (tiles_extract_in := Path(config["mjolnir"].get("tile_extract"))).is_file():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first time ever I used the cute "walrus" operator haha it's python3.8 (3.7 is EOL by now)

@nilsnolde
Copy link
Member Author

There's still some small stuff failing, will look at it tmrw.

@nilsnolde
Copy link
Member Author

nilsnolde commented Aug 16, 2023

For future reference, I should note how this should be used:

If you want to use valhalla_build_extract with a spatial filter, i.e. either with --bbox or --geojson-dir (also works without, but really doesn't make any sense), and you only have a .tar'd graph, you can say this to save the spatial extract .tar to ./path/to/input_graph.tar:

./scripts/valhalla_build_extrac -b <minx,miny,maxx,maxy> -e path/to/output_graph.tar -i '{"mjolnir": {"tile_extract": "path/to/input_graph.tar"}}' 



class TestBuildExtract(unittest.TestCase):
def test_tile_intersects_bbox(self):
# bogus tile dir
tile_dir = Path("/home/")
tile_dir = Path("/foo/")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/home is far away from a "bogus dir" haha

@@ -173,11 +223,9 @@ def get_tiles_with_bbox(all_tile_paths: List[Path], bbox_str: str, tiles_dir_: P
LOGGER.critical(f"Bbox invalid: {list(bbox)}")
sys.exit(1)

tile_paths_ = set()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this used to collect a set before. in python, that's an unordered structure, so really we were producing unordered tar files 🙄 my bad.. it's a list now and we deduplicate just before writing it to the tar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reference: we can get duplicate tiles for --geojson-dir where multiple polygons intersect the same tile.

kevinkreiser
kevinkreiser previously approved these changes Aug 28, 2023
@nilsnolde
Copy link
Member Author

can you re-approve again pls @kevinkreiser ? I added a changelog, so it dimissed.

@kevinkreiser kevinkreiser merged commit 9c8bd05 into master Aug 29, 2023
8 checks passed
@kevinkreiser kevinkreiser deleted the nn-extract-from-tar branch August 29, 2023 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants