Setting up a local build environment is usually tricky, so use a Docker container to generate the data/ directory.
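
As a rough illustration of that workflow, the sketch below assembles a `docker run` invocation that bind-mounts the local data/ directory into the container, so whatever the build writes there ends up on the host. The image tag `data-builder`, the container path `/work/data`, and both function names are assumptions made for this example and do not come from this project; substitute whatever the project's Dockerfile actually defines.

```python
import os
import subprocess


def docker_run_command(image, host_data_dir, container_data_dir="/work/data"):
    """Build the argv for a `docker run` that writes into host_data_dir.

    `image` and `container_data_dir` are hypothetical placeholders.
    """
    host_path = os.path.abspath(host_data_dir)  # bind mounts need an absolute path
    return [
        "docker", "run",
        "--rm",                                     # remove the container after it exits
        "-v", f"{host_path}:{container_data_dir}",  # share data/ with the container
        image,
    ]


def generate_data(image="data-builder", host_data_dir="data"):
    """Create data/ if needed, then run the container (requires Docker)."""
    os.makedirs(host_data_dir, exist_ok=True)
    subprocess.run(docker_run_command(image, host_data_dir), check=True)
```

Calling `generate_data()` after building the image (for example with `docker build -t data-builder .`, again assuming that tag) would run the container and leave its output in data/.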