Many of the data filenames are hardcoded, but the files themselves are not included in this repo because each is larger than 40 MB. All of them were downloaded from nltk, in particular the Brown corpus (nltk.brown).
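Since the data files are not shipped with the repo, here is a minimal sketch of how the Brown corpus could be fetched again through nltk. The function names `load_brown_tagged_sents` and `format_tagged_sentence` are hypothetical and do not come from this repo, and the hardcoded filenames the scripts expect are not reproduced here, so check the scripts for where the data should end up.

```python
def load_brown_tagged_sents(categories=None):
    """Fetch the Brown corpus via nltk and return its POS-tagged sentences.

    Hypothetical helper, not part of this repo.
    """
    # Imported lazily so the formatting helper below works without nltk.
    import nltk
    from nltk.corpus import brown

    # Skips the download if the corpus is already up to date locally.
    nltk.download("brown", quiet=True)
    return brown.tagged_sents(categories=categories)


def format_tagged_sentence(tagged_sentence):
    """Render [(word, tag), ...] as one 'word/TAG word/TAG' line."""
    return " ".join(f"{word}/{tag}" for word, tag in tagged_sentence)
```

Calling `load_brown_tagged_sents()` needs nltk installed and network access the first time; passing each returned sentence to `format_tagged_sentence` gives one line of text per sentence.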
To run the generation task yourself, everything you need is in generation/.
To run the POS tagging task yourself, go to pos/ and run test.py. If you want it to finish in under 10 minutes, you may want to replace the load=None argument with "gen_res", "bigram", "kron" (Parent), or "kron2" (Child).