# Model analysis and errors

This notebook collects examples of model reasoning as well as model errors. The ThoughtSource group uses it to save and exchange findings.
Please document your examples precisely enough that we can reproduce them.

In [1]:
from cot.dataloader import Collection
from cot.generate import generate_and_extract, FRAGMENTS

**Please copy the template below and then make an analysis or error report based on its structure.**

## Template for reports and reproducibility

In [None]:
# provide which dataset you loaded
collection = Collection(["worldtree"], verbose=False)

Loading worldtree...


In [None]:
# provide the config used; copying it here without the comments is enough.
config={
    "idx_range":(0,1), # Determines which indices the generate_and_extract routine is applied to, Default: None (All items are used)
    "debug": True, # Determines whether an api is called or a mock is returned, used for debugging, Default: True (api is not used)
    "instruction_keys": ["qa-01"], # Determines which instructions are used from fragments.json, Default: None (All used)
    "cot_trigger_keys": ["kojima-01"], # Determines which cot triggers are used from fragments.json, Default: None (All are used)
    "answer_extraction_keys": ["kojima-01"], # Determines which answer extraction prompts are used from fragments.json, Default: None (All are used)
    "author" : "Konstantin", # Name of the person responsible for generation, Default: ""
    "api_service": "huggingface_hub", # Name of the API called ("openai", "huggingface_hub"), Default: "openai"
    "engine": "google/flan-t5-xl", # Name of the engine used (for "huggingface_hub" use for example "google/flan-t5-xl"), Default: "text-davinci-002"
    "temperature": 0, # Sampling temperature of the model, Default: 0
    "max_tokens": 512, # Maximum length of output generated by the model, Default: 128
    "api_time_interval": 1.0, # Pause between two api calls in seconds, Default: 1.0
    "warn": False,
}
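
The comments in the template config above list a default for most keys. As a minimal sketch (an illustration only, not the library's own merging logic), the snippet below collects those documented defaults in a dict and shows which values a shorter config would end up with. The names `DOCUMENTED_DEFAULTS` and `effective_config` are hypothetical, and `"warn"` is left out because the template does not state its default.

```python
# Defaults as stated in the comments of the template config.
# Illustrative only: the library may resolve defaults differently.
DOCUMENTED_DEFAULTS = {
    "idx_range": None,               # None: all items are used
    "debug": True,                   # True: the API is not called
    "instruction_keys": None,        # None: all instructions are used
    "cot_trigger_keys": None,        # None: all CoT triggers are used
    "answer_extraction_keys": None,  # None: all answer extraction prompts are used
    "author": "",
    "api_service": "openai",
    "engine": "text-davinci-002",
    "temperature": 0,
    "max_tokens": 128,
    "api_time_interval": 1.0,
}


def effective_config(overrides):
    """Return the documented defaults updated with the given overrides (hypothetical helper)."""
    return {**DOCUMENTED_DEFAULTS, **overrides}


# A config that sets only a few keys keeps the documented defaults for the rest.
short_config = {"debug": False, "api_service": "huggingface_hub", "engine": "google/flan-t5-xl"}
print(effective_config(short_config)["max_tokens"])  # 128
```

Listing the effective values alongside a report makes it easier to see which settings were left at their defaults.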

In [None]:
# generate examples
collection.generate("worldtree", config=config)

n_samples: 3, n_instruction_keys: 1, n_cot_trigger_keys: 1, n_answer_extraction_keys: 1


In [None]:
# save your relevant outputs by copying them to the markdown field below
# collection["worldtree"]["train"]['generated_cot'] 
collection["worldtree"]["train"][list(config['idx_range'])]['generated_cot']

[[{'annotation': [],
   'answers': [{'answer': '\nB) Earth rotates on its axis.',
     'answer_extraction': 'kojima-01',
     'correct_answer': None}],
   'api_service': 'openai',
   'author': 'Konstantin',
   'comment': '',
   'cot': '\nThe Sun appears to move across the sky every day.\n\nThe most likely explanation is that Earth rotates on its axis.',
   'cot_trigger': 'kojima-01',
   'date': '2022/11/17 10:45:43',
   'instruction': 'qa-01',
   'model': {'max_tokens': 512, 'name': 'text-davinci-002', 'temperature': 0},
   'templates_version': '0.01'}],
 []]

**Copy the relevant outputs here:**

{'annotation': [],
   'answers': [{'answer': '\nB) Earth rotates on its axis.',
     'answer_extraction': 'kojima-01',
     'correct_answer': None},
    {'answer': '\nB) Earth rotates on its axis.',
     'answer_extraction': 'kojima-02',
     'correct_answer': None}],
   'api_service': 'openai',
   'author': '',
   'comment': '',
   'cot': '\nThe Sun appears to move across the sky every day.\n\nThe most likely explanation is that Earth rotates on its axis.',
   'cot_trigger': 'kojima-01',
   'date': '2022/11/15 13:19:25',
   'instruction': 'qa-01',
   'model': {'max_tokens': 512, 'name': 'text-davinci-002', 'temperature': 0},
   'templates_version': '0.01'}

**Description of issue:**

This is the field for your interpretation or error report.

### Model uses B) C) D) in CoT

In [5]:
# provide which dataset you loaded
collection = Collection(["worldtree"], verbose=False)
collection = collection.select(split="train", random_samples=False)

Loading worldtree...


In [6]:
# provide the config used; copying it here without the comments is enough.
config={
    "idx_range":(1722,1723),
    "debug": False,
    "instruction_keys": ['qa-01'],
    "cot_trigger_keys": ['kojima-01'],
    "answer_extraction_keys": ['kojima-A-D'],
    "api_service": "huggingface_hub", # Name of the API called ("openai", "huggingface_hub"), Default: "openai"
    "engine": "google/flan-t5-xl", # Name of the engine used (for "huggingface_hub" use for example "google/flan-t5-xl"), Default: "text-davinci-002"
}

In [7]:
# generate examples
collection.generate("worldtree", config=config)


            You are about to call an external API in total 2 times, which may produce costs.
            Number API calls for CoT generation: n_samples 1 * n_instruction_keys 1 * n_cot_trigger_keys 1
            Number API calls for answer extraction: n_samples 1 * n_instruction_keys 1 * n_cot_trigger_keys 1 * n_answer_extraction_keys 1
            Do you want to continue? y/n
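
The call counts in the warning above follow directly from the two products it prints. A small sketch of that arithmetic, using a hypothetical helper `count_api_calls` that is not part of the library, lets you estimate the number of calls before running a larger config:

```python
def count_api_calls(n_samples, n_instruction_keys, n_cot_trigger_keys, n_answer_extraction_keys):
    """Reproduce the arithmetic from the warning message (hypothetical helper)."""
    # One CoT generation call per sample, instruction and CoT trigger combination.
    cot_calls = n_samples * n_instruction_keys * n_cot_trigger_keys
    # One answer extraction call per generated CoT and answer extraction prompt.
    extraction_calls = cot_calls * n_answer_extraction_keys
    return cot_calls, extraction_calls, cot_calls + extraction_calls


# Values from the warning: 1 sample, 1 instruction, 1 CoT trigger, 1 answer extraction.
print(count_api_calls(1, 1, 1, 1))  # (1, 1, 2)
```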
            


In [8]:
# save your relevant outputs by copying them to the markdown field below
# collection["worldtree"]["train"]['generated_cot'] 
collection["worldtree"]["train"][list(config['idx_range'])]['generated_cot']

[[{'templates_version': '0.01',
   'instruction': 'qa-01',
   'cot_trigger': 'kojima-01',
   'prompt_text': "Answer the following question through step-by-step reasoning.\nSharpening a pencil and tearing paper are examples of physical changes. Which statement describes why these are physical changes?\nA) There is a change in how the objects are used.\nB) There is a change in the appearance of the objects.\nC) There is a change in the materials from which the objects are made.\nD) There is a change in both the appearance of the objects and the materials from which they are made.\nAnswer: Let's think step by step.\n",
   'answers': [{'answer_extraction': 'kojima-A-D',
     'answer_extraction_text': "Answer the following question through step-by-step reasoning.\nSharpening a pencil and tearing paper are examples of physical changes. Which statement describes why these are physical changes?\nA) There is a change in how the objects are used.\nB) There is a change in the appearance of the ob

**Copy the outputs here:**

[[{'templates_version': '0.01',
   'instruction': 'qa-01',
   'cot_trigger': 'kojima-01',
   'prompt_text': "Answer the following question through step-by-step reasoning.\nSharpening a pencil and tearing paper are examples of physical changes. Which statement describes why these are physical changes?\nA) There is a change in how the objects are used.\nB) There is a change in the appearance of the objects.\nC) There is a change in the materials from which the objects are made.\nD) There is a change in both the appearance of the objects and the materials from which they are made.\nAnswer: Let's think step by step.\n",
   'answers': [{'answer_extraction': 'kojima-A-D',
     'answer_extraction_text': "Answer the following question through step-by-step reasoning.\nSharpening a pencil and tearing paper are examples of physical changes. Which statement describes why these are physical changes?\nA) There is a change in how the objects are used.\nB) There is a change in the appearance of the objects.\nC) There is a change in the materials from which the objects are made.\nD) There is a change in both the appearance of the objects and the materials from which they are made.\nAnswer: Let's think step by step.\nB) There is a change in the appearance of the objects. C) There is a change in the materials from which the objects are made. D) There is a change in both the appearance of the objects and the materials from which they are made. So, the final answer is B.\nTherefore, among A through D, the answer is\n",
     'answer': 'B.',
     'correct_answer': None}],
   'cot': 'B) There is a change in the appearance of the objects. C) There is a change in the materials from which the objects are made. D) There is a change in both the appearance of the objects and the materials from which they are made. So, the final answer is B.',
   'author': '',
   'date': '2022/11/24 16:46:05',
   'api_service': 'huggingface_hub',
   'model': "{'name': 'google/flan-t5-xl', 'temperature': 0, 'max_tokens': 128}",
   'comment': '',
   'annotation': []}],
 []]

**Description of issue:**

Instead of reasoning step by step, the model's CoT repeats answer options B), C), and D) verbatim and then states the final answer. The CoT therefore contains no actual reasoning.
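
To find further instances of this pattern, one option is to flag CoTs that contain several answer-option labels. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the regular expression only covers labels of the form `A)` to `D)`, and a CoT that legitimately discusses the options one by one would also be flagged.

```python
import re

# Matches answer-option labels such as "A)" or "B)" that start at a word boundary.
OPTION_LABEL = re.compile(r"\b([A-D])\)")


def option_labels_in_cot(cot):
    """Return the option letters that appear as 'X)' labels in the CoT, in order of appearance."""
    return OPTION_LABEL.findall(cot)


def looks_like_option_listing(cot, min_labels=2):
    """Flag a CoT that contains at least `min_labels` distinct option labels."""
    return len(set(option_labels_in_cot(cot))) >= min_labels


# The CoT from the report above.
cot = (
    "B) There is a change in the appearance of the objects. "
    "C) There is a change in the materials from which the objects are made. "
    "D) There is a change in both the appearance of the objects and the materials "
    "from which they are made. So, the final answer is B."
)
print(option_labels_in_cot(cot))       # ['B', 'C', 'D']
print(looks_like_option_listing(cot))  # True
```

The trailing "the final answer is B." is not counted because the letter is not followed by a closing parenthesis.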