### 1. Generate a PhantomWiki Instance

The first step is to generate a PhantomWiki instance by specifying the parameters of the universe. Generation starts with people connected by family relationships, following the family tree generator of Hohenecker & Lukasiewicz (2020). (TODO: cite)


You can specify the following parameters for family tree generation (see the source code for more options):

- `--num-samples`: the number of family trees in a PhantomWiki universe
- `--max-tree-size`: the maximum number of people in one family tree
- `--max-branching-factor`: the maximum number of children that a person in a family tree may have

In [None]:
num_samples = 1
max_tree_size = 25
max_branching_factor = 5

Specify the folder in which to store your PhantomWiki instance (including facts and QA pairs).

In [None]:
output_dir = "out"

Now we can run the following command to generate a PhantomWiki universe. 

In [None]:
!python -m phantom_wiki --num-samples $num_samples --max-tree-size $max_tree_size --max-branching-factor $max_branching_factor --output-dir $output_dir --debug --valid-only
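
Outside a notebook, where the `!` shell escape is not available, the same command can be launched from plain Python. The following is a minimal sketch: the helper name `build_generate_command` is hypothetical and not part of PhantomWiki, and it only uses the flags that already appear in the cell above.

```python
import subprocess
import sys


def build_generate_command(num_samples, max_tree_size, max_branching_factor, output_dir):
    """Return the argument list for `python -m phantom_wiki` (hypothetical helper)."""
    return [
        sys.executable, "-m", "phantom_wiki",
        "--num-samples", str(num_samples),
        "--max-tree-size", str(max_tree_size),
        "--max-branching-factor", str(max_branching_factor),
        "--output-dir", str(output_dir),
        "--debug",
        "--valid-only",
    ]


cmd = build_generate_command(1, 25, 5, "out")
print(" ".join(cmd))
# Uncomment to actually run the generator (requires phantom_wiki to be installed):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if the output folder name contains spaces.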

### 2. Visualization of the PhantomWiki universe

We have now generated a PhantomWiki universe, stored in the `$output_dir` folder.

#### 2.1 Visualization of family trees

We can first take a look at the family trees we generated. 

In [None]:
from IPython.display import Image

# Show the first generated family tree; the previous step may have generated more than one.
family_tree_file = f"{output_dir}/family_tree_1.png"

Image(filename=family_tree_file)

Every person in PhantomWiki has a first name and a last name. (TODO: traditional family structure?) Colors indicate the gender of each person in the PhantomWiki universe. Arrows indicate parental relationships.

#### 2.2 Generated Articles

Besides family relationships, the generated facts include friendships, hobbies, and occupations for the people in the universe. These facts are stored in `facts.pl` and are later converted into articles.
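
To inspect `facts.pl` without a Prolog interpreter, simple ground facts can be read with a few lines of Python. This is a rough sketch under assumptions: it assumes each fact sits on its own line in the form `predicate(arg1, arg2).`, and the predicate names and people in `sample` are made up for illustration rather than taken from a real PhantomWiki instance. For real queries, a Prolog engine is the more reliable tool.

```python
import re
from collections import defaultdict

# Matches one ground fact per line, e.g. `hobby("alice", "chess").`
# Arguments containing parentheses are not supported by this pattern.
FACT_PATTERN = re.compile(r"^\s*(\w+)\(([^()]*)\)\.\s*$")


def parse_facts(text):
    """Group ground facts by predicate name (assumes one fact per line)."""
    facts = defaultdict(list)
    for line in text.splitlines():
        match = FACT_PATTERN.match(line)
        if match is None:
            continue  # skip anything that is not a simple one-line fact
        predicate, raw_args = match.groups()
        # Naive split: assumes arguments contain no commas or nested terms.
        args = tuple(arg.strip().strip("\"'") for arg in raw_args.split(","))
        facts[predicate].append(args)
    return dict(facts)


# Hypothetical sample, not copied from a generated facts.pl
sample = '''
% a comment
hobby("alice", "chess").
friend("alice", "bob").
friend("bob", "carol").
'''
facts = parse_facts(sample)
print(facts["friend"])
```

To try it on a generated instance, read the file contents and pass them to `parse_facts`, for example with `open(f"{output_dir}/facts.pl").read()` if `facts.pl` sits directly under the output folder (an assumption).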

(TODO: Do we want to show an example of how to query the prolog database?)

These facts are converted into an article for each person using predefined templates.
The generated articles are saved in the `articles` folder. Each person has an associated `$name.txt` file listing the facts about that person. (For family relationships, only information about parents and siblings is reflected in the articles.)

Here is an example of a generated article:


In [None]:
# input the name of the person you want to read the article about
name = "Aida Wang"
article_file = f"{output_dir}/articles/{name}.txt"
with open(article_file, "r") as f:
    article = f.read()
print(article)
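
The name used above may not exist in your own generated universe. A small sketch for listing the available names, assuming (as described above) that each article is stored as `$name.txt` in the `articles` folder; the helper name `list_article_names` is hypothetical:

```python
from pathlib import Path


def list_article_names(output_dir):
    """Return the sorted person names that have an article file."""
    articles_dir = Path(output_dir) / "articles"
    return sorted(path.stem for path in articles_dir.glob("*.txt"))


output_dir = "out"  # same folder as chosen in step 1
if (Path(output_dir) / "articles").is_dir():
    names = list_article_names(output_dir)
    print(f"{len(names)} articles, e.g. {names[:5]}")
```

Any of the printed names can then be assigned to `name` to read that person's article.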

#### 2.3 Generated QA pairs

Each PhantomWiki instance also contains Question-Answer pairs that are consistent with the generated facts. 

The difficulty of the generated questions is tunable via the `--depth` flag when running the `python -m phantom_wiki` command above. By default, using `--depth 10` gives us `8` types of question templates. The number of questions generated from each type of template can be specified via `--num-questions-per-type` (default is 10). These questions are stored in the `questions` folder, arranged by type.

Let's now look at some of the questions: 

In [None]:
import json

# specify the type of question you want to look at
question_type = 0  # named question_type to avoid shadowing the built-in `type`
question_file = f"{output_dir}/questions/type{question_type}.json"

with open(question_file, "r") as f:
    questions = json.load(f)

We can look at a sampled question and its answer, along with the original question template:

In [None]:
question = questions[0]
print("Question: ", question["question"])
print("Answer: ", question["answer"])
print("Template: ", question["template"])
# print("Prolog:", question['prolog'])
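
To get an overview across all question types rather than a single file, the files in the `questions` folder can be scanned in one pass. A minimal sketch, assuming the `type{N}.json` naming used above and that each file holds a JSON list of question records; `summarize_questions` is a hypothetical helper, not part of PhantomWiki:

```python
import json
from pathlib import Path


def summarize_questions(output_dir):
    """Map each question file name to the number of questions it contains."""
    questions_dir = Path(output_dir) / "questions"
    summary = {}
    for path in sorted(questions_dir.glob("type*.json")):
        with open(path, "r") as f:
            summary[path.name] = len(json.load(f))
    return summary


for file_name, count in summarize_questions("out").items():
    print(f"{file_name}: {count} questions")
```

If the counts differ from the value passed to `--num-questions-per-type`, that is worth checking before running an evaluation.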

### 3. Evaluation on a PhantomWiki instance

#### 3.1 Run evaluation with specific model and method

Now we can finally run evaluation on the generated PhantomWiki dataset. As an example, we test the `zeroshot` method using the Llama model `meta-llama/llama-3.2-1b-instruct`.

In [None]:
method = "zeroshot"
model = "meta-llama/llama-3.2-1b-instruct"
preds_dir = f"{output_dir}/preds"

In [None]:
# TODO: either add command to upload to huggingface or add code to run locally
# TODO: how to unify with the eval bash script
!python -m phantom_eval --method $method -od $preds_dir -m $model

#### 3.2 Visualize the results