In [100]:
!pip install openai
!pip install markitdown
!pip install annoy



In [101]:
from openai import OpenAI # OpenAI API
import json
import requests # to download some resources
import os # file operations
import numpy as np # linear algebra
import pandas  as pd # data processing
from markdown import markdown # to render markdown
from IPython.display import Markdown 
from markitdown import MarkItDown
import annoy # Approximate Nearest Neighbors Oh Yeah for fast searching 
import pickle
from annoy import AnnoyIndex

In [106]:
# There are some files we'll use today - this just downloads them

urls_to_download = [
    "https://raw.githubusercontent.com/jhellingsdata/RADataHub/main/misc/LLM_practical/capstone/documents/consultations/consultation_1.pdf",
    "https://raw.githubusercontent.com/jhellingsdata/RADataHub/main/misc/LLM_practical/capstone/documents/consultations/consultation_2.pdf",
    "https://raw.githubusercontent.com/jhellingsdata/RADataHub/main/misc/LLM_practical/capstone/documents/consultations/consultation_3.pdf",
    "https://raw.githubusercontent.com/jhellingsdata/RADataHub/main/misc/LLM_practical/capstone/documents/consultations/consultation_4.pdf",
    "https://raw.githubusercontent.com/jhellingsdata/RADataHub/main/misc/LLM_practical/capstone/documents/consultations/council_report.pdf",
]

# Download the files
for url in urls_to_download:
    response = requests.get(url)
    response.raise_for_status()  # stop early if a download fails
    filename = url.split("/")[-1].split("?")[0]
    with open(filename, "wb") as f:
        f.write(response.content)
    print(".", end="")  # one dot per downloaded file


.....

# Interacting with LLMs

In this workshop we'll introduce programmatic use of LLMs through the OpenAI API. We'll look at prompts, their structure, and how to parse documents.

First, we'll need to set up our OpenAI client.

In [None]:

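client = OpenAI(
    api_key="YOUR_API_KEY",  # use your API key here, direct from OpenAI
)

# Alternatively, read the key from a local file instead of pasting it into the notebook:
# client = OpenAI(api_key=open("/Users/finn/Documents/keys/openai").read().strip())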


</br></br>

We can now interact with the LLM with just a few lines of code.

In [None]:
prompt = "Write a haiku about the LSE centre building."

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a critically acclaimed poet."},
        {"role": "user", "content": prompt},
    ],
)

print(response.choices[0].message.content)

Glass and steel unite,  
Ideas spark in bright halls' glow—  
Knowledge weaves its thread.


</br></br>

## Structuring prompts

Let's take a closer look at the structure of the `messages` object that we'll be using to interact with the LLM.

It's a list representing the conversation history. Each message has one of three roles:

</br>

- `system`: A structural message that defines the conversation context. It guides the AI assistant's behavior, tone, and format. Optional.
- `user`: A message from the user. The last message in the list is normally a user message, and the conversation can include several.
- `assistant`: A message from the AI assistant.

</br>


The `assistant` messages don't need to be authentic. You can write example messages yourself to steer the conversation or to show the model a well-completed example.
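
As a minimal sketch of the second idea, the helper below places a hand-written user/assistant pair ahead of the real request, so the model can imitate the example's style. The name `build_few_shot_messages` and the example texts are illustrative and not part of the OpenAI API.

```python
def build_few_shot_messages(system_prompt, examples, user_prompt):
    """Build a chat `messages` list that includes hand-written example exchanges.

    `examples` is a list of (user_text, assistant_text) pairs. The assistant
    replies are written by us, not generated by the model.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real request goes last
    messages.append({"role": "user", "content": user_prompt})
    return messages


few_shot_messages = build_few_shot_messages(
    system_prompt="You are a top chef. Answer in one sentence.",
    examples=[("Suggest a quick breakfast.", "Scrambled eggs on toast, ready in five minutes.")],
    user_prompt="Suggest a quick vegetarian dinner.",
)

for message in few_shot_messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.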

In [94]:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a top chef."},
        {"role": "user", "content": "Provide me with an easy recipe to feed 4 people."},
        {"role": "assistant", "content": "Sure! Do you have any dietary restrictions or a preference for a type of cuisine?"},
        {"role": "user", "content": "I'm a vegetarian and interested in Italian cuisine."},
    ],
)

print(response.choices[0].message.content)

Great choice! Here’s a simple and delicious recipe for **Vegetarian Pasta Primavera** that serves 4 people. 

### Vegetarian Pasta Primavera

#### Ingredients:

- **12 oz (340 g) pasta:** (penne, fettuccine, or your favorite type)
- **2 tablespoons olive oil**
- **1 medium onion, thinly sliced**
- **2 cloves garlic, minced**
- **1 bell pepper (red or yellow), sliced**
- **1 zucchini, sliced**
- **1 cup cherry tomatoes, halved**
- **1 cup broccoli florets**
- **1 cup spinach leaves**
- **1 teaspoon dried oregano**
- **1 teaspoon dried basil**
- **Salt and pepper to taste**
- **1/2 cup grated Parmesan cheese (or vegan cheese for a vegan option)**
- **Fresh basil leaves for garnish (optional)**

#### Instructions:

1. **Cook the Pasta:**
   - In a large pot, bring salted water to a boil. Add the pasta and cook according to package instructions until al dente. Reserve about 1 cup of pasta water, then drain the pasta and set aside.

2. **Sauté the Vegetables:**
   - In a large skillet over 

### Exercise: Try customising the system instructions
Can you enforce a specific tone or style in the conversation by changing the system messages?
What happens if you add a system message that contradicts the user message, or specify a tone?

In [9]:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": """
         




         """},
        {"role": "user", "content": "Provide me with an easy recipe to feed 4 people."},
        {"role": "assistant", "content": "Sure! Do you have any dietary restrictions or a preference for a type of cuisine?"},
        {"role": "user", "content": "I'm a vegetarian and interested in Italian cuisine."},
    ],
)

print(response.choices[0].message.content)

</br></br>
</br></br>

## Parsing sources: Identifying ingredients in a recipe

In this example, we'll use an LLM to parse a natural language recipe. We'll extract the ingredients, the kitchen appliances required, and a suggestion for how to make it healthier.

</br></br>

In [95]:
recipe = """
# Miso-Butter Roast Chicken With Acorn Squash Panzanella

Pat chicken dry with paper towels, season all over with 2 tsp. salt, and tie legs together with kitchen twine. Let sit at room temperature 1 hour.
Meanwhile, halve squash and scoop out seeds. Run a vegetable peeler along ridges of squash halves to remove skin. Cut each half into \u00bd\"-thick wedges; arrange on a rimmed baking sheet.
Combine sage, rosemary, and 6 Tbsp. melted butter in a large bowl; pour half of mixture over squash on baking sheet. Sprinkle squash with allspice, red pepper flakes, and \u00bd tsp. salt and season with black pepper; toss to coat.
Add bread, apples, oil, and \u00bc tsp. salt to remaining herb butter in bowl; season with black pepper and toss to combine. Set aside.
Place onion and vinegar in a small bowl; season with salt and toss to coat. Let sit, tossing occasionally, until ready to serve.
Place a rack in middle and lower third of oven; preheat to 425\u00b0F. Mix miso and 3 Tbsp. room-temperature butter in a small bowl until smooth. Pat chicken dry with paper towels, then rub or brush all over with miso butter. Place chicken in a large cast-iron skillet and roast on middle rack until an instant-read thermometer inserted into the thickest part of breast registers 155\u00b0F, 50\u201360 minutes. (Temperature will climb to 165\u00b0F while chicken rests.) Let chicken rest in skillet at least 5 minutes, then transfer to a plate; reserve skillet.
Meanwhile, roast squash on lower rack until mostly tender, about 25 minutes. Remove from oven and scatter reserved bread mixture over, spreading into as even a layer as you can manage. Return to oven and roast until bread is golden brown and crisp and apples are tender, about 15 minutes. Remove from oven, drain pickled onions, and toss to combine. Transfer to a serving dish.
Using your fingers, mash flour and butter in a small bowl to combine.
Set reserved skillet with chicken drippings over medium heat. You should have about \u00bc cup, but a little over or under is all good. (If you have significantly more, drain off and set excess aside.) Add wine and cook, stirring often and scraping up any browned bits with a wooden spoon, until bits are loosened and wine is reduced by about half (you should be able to smell the wine), about 2 minutes. Add butter mixture; cook, stirring often, until a smooth paste forms, about 2 minutes. Add broth and any reserved drippings and cook, stirring constantly, until combined and thickened, 6\u20138 minutes. Remove from heat and stir in miso. Taste and season with salt and black pepper.
Serve chicken with gravy and squash panzanella alongside."""

In [96]:
prompt = "From this recipe, tell me what are the ingredients and kitchen appliances needed? Give me a suggestion of how to make it healthier. Recipe:"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are an assistant that summarises recipes. Return brief bullet points for responses."},
        {"role": "user", "content": prompt + recipe},
    ]
)

Markdown(response.choices[0].message.content)

### Ingredients:
- Chicken 
- Salt (2 tsp + ½ tsp + for pickling onions)
- Kitchen twine
- Acorn squash
- Vegetable peeler
- Sage
- Rosemary
- Butter (6 Tbsp + 3 Tbsp)
- Allspice
- Red pepper flakes
- Black pepper
- Bread
- Apples
- Oil
- Onion
- Vinegar
- Miso
- Wine
- Broth

### Kitchen Appliances Needed:
- Oven
- Rimmed baking sheet
- Large bowl
- Small bowl
- Cast-iron skillet
- Instant-read thermometer
- Wooden spoon 

### Healthier Suggestion:
- Substitute butter with olive oil or a lower-fat spread for roasting the chicken and vegetables to reduce saturated fat.

</br></br>

## Structuring the prompt and response

This is great, but we've given the LLM normal unstructured text and got back only marginally more structured text. We can do better by giving the LLM instructions to return data in a structured format - JSON. This will allow us to easily extract the information we want from the response.

In [None]:
prompt = """From this recipe, extract the following details and return them in a structured JSON format:
1. A list of ingredients.
2. A list of kitchen appliances required.
3. A suggestion on how to make the recipe healthier.

The response **must** follow this exact JSON structure:
{
    "ingredients": ["List of ingredients as strings"],
    "kitchen_appliances": ["List of kitchen appliances as strings"],
    "health_suggestion": "A single string suggestion to make the recipe healthier"
}

Recipe:"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type":"json_object"},
    messages=[
        {"role": "system", "content": "You help structure recipes. Always return JSON responses following the specified format."},
        {"role": "user", "content": prompt + recipe},
    ]
)

# Print the JSON response
print(response.choices[0].message.content)

{
    "ingredients": [
        "1 whole chicken",
        "2 tsp salt",
        "Kitchen twine",
        "1 acorn squash",
        "6 Tbsp melted butter",
        "Sage",
        "Rosemary",
        "Allspice",
        "Red pepper flakes",
        "Black pepper",
        "Bread",
        "Apples",
        "Oil",
        "¼ tsp salt",
        "Onion",
        "Vinegar",
        "Miso",
        "3 Tbsp room-temperature butter",
        "¼ cup wine",
        "Broth"
    ],
    "kitchen_appliances": [
        "Oven",
        "Rimmed baking sheet",
        "Large bowl",
        "Small bowl",
        "Large cast-iron skillet",
        "Instant-read thermometer",
        "Wooden spoon"
    ],
    "health_suggestion": "Substitute butter with olive oil to reduce saturated fat."
}


The response we get back is still just text, but we can parse it into a Python dictionary with `json.loads`.

In [98]:
parsed_recipe = json.loads(response.choices[0].message.content)
parsed_recipe

{'ingredients': ['1 whole chicken',
  '2 tsp salt',
  'Kitchen twine',
  '1 acorn squash',
  '6 Tbsp melted butter',
  'Sage',
  'Rosemary',
  'Allspice',
  'Red pepper flakes',
  'Black pepper',
  'Bread',
  'Apples',
  'Oil',
  '¼ tsp salt',
  'Onion',
  'Vinegar',
  'Miso',
  '3 Tbsp room-temperature butter',
  '¼ cup wine',
  'Broth'],
 'kitchen_appliances': ['Oven',
  'Rimmed baking sheet',
  'Large bowl',
  'Small bowl',
  'Large cast-iron skillet',
  'Instant-read thermometer',
  'Wooden spoon'],
 'health_suggestion': 'Substitute butter with olive oil to reduce saturated fat.'}
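
JSON mode makes the reply parseable, but it does not guarantee the keys or types we asked for, so a light check before using the result can help. A minimal sketch, where `validate_recipe` and `sample_reply` are illustrative names and the sample text is made up:

```python
import json

# Keys and types requested in the prompt above
EXPECTED_TYPES = {
    "ingredients": list,
    "kitchen_appliances": list,
    "health_suggestion": str,
}


def validate_recipe(reply_text):
    """Parse a JSON reply and check that it has the expected keys and types."""
    parsed = json.loads(reply_text)  # raises json.JSONDecodeError on invalid JSON
    for key, expected_type in EXPECTED_TYPES.items():
        if key not in parsed:
            raise ValueError(f"Missing key: {key}")
        if not isinstance(parsed[key], expected_type):
            raise ValueError(f"{key} should be a {expected_type.__name__}")
    return parsed


# A made-up reply, standing in for response.choices[0].message.content
sample_reply = '{"ingredients": ["Bread"], "kitchen_appliances": ["Oven"], "health_suggestion": "Use less butter."}'
checked = validate_recipe(sample_reply)
print(checked["ingredients"])
```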

### Exercise: Customise the prompt

What else can we learn about the recipe? Add more questions to the prompt to extract more information.

# Learning from documents
In the last example, we used the LLM to extract information from some text we supplied it with. 

We got it to reply with a structured response, which means we could run this code over and over again to extract information from many recipes.
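
A rough sketch of that loop is below. To keep it runnable without an API key, the model call is replaced by a stand-in function, `fake_llm`, which is purely hypothetical; in practice it would wrap the `client.chat.completions.create` call shown earlier. `extract_many` and the two tiny recipes are also made up for illustration.

```python
import json


def fake_llm(recipe_text):
    """Stand-in for the API call: returns a JSON string, as the model would."""
    first_word = recipe_text.split()[0].strip(",.")
    return json.dumps(
        {"ingredients": [first_word], "kitchen_appliances": [], "health_suggestion": ""}
    )


def extract_many(recipes, ask_llm):
    """Run the same extraction over many recipes, keeping failures separate."""
    results, failures = [], []
    for name, text in recipes.items():
        try:
            results.append({"recipe": name, **json.loads(ask_llm(text))})
        except json.JSONDecodeError:
            failures.append(name)  # reply was not valid JSON: retry or inspect later
    return results, failures


example_recipes = {
    "toast": "Bread, toasted until golden.",
    "salad": "Lettuce tossed with oil and vinegar.",
}
results, failures = extract_many(example_recipes, fake_llm)
print(results)
print(failures)
```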

What if we want to learn from a PDF, a Word document or a webpage instead?

First, we have to get the text from the document. There are various options for this:
- MarkItDown, an open-source tool from Microsoft that can handle a variety of formats
- Format-specific parsers, such as `python-docx` for Word documents
- `PDFMiner`, a Python package that can extract text from PDFs
- Web scraping libraries like `BeautifulSoup` for webpages

Today we'll use **MarkItDown**. It's a good all-rounder.

## Example: Planning application documents

I have a collection of documents about a planning application. There are some consultation responses and a council summary. 

I want to learn about consultees' opinions on the application and the project details.

</br></br>

We need to convert our documents into raw text that the LLM can understand. We'll use MarkItDown to get Markdown from each PDF.

### Loading the documents



In [108]:
consulations_md = []

consulation_paths = [
    "consultation_1.pdf",
    "consultation_2.pdf",
    "consultation_3.pdf",
    "consultation_4.pdf",
]

md = MarkItDown() # One converter can be reused for every file

for path in consulation_paths:
    converted_doc = md.convert(path) # Convert the PDF to markdown
    consulations_md.append(converted_doc.text_content) # Append the markdown text to the list


consulations_md[0][:1000] # Let's take a peek at the first 1000 characters of the first consultation

'Safety and Airspace Regulation Group\nFuture Safety\n\nStewart Heald\nSenior Consultant\nOsprey Consulting Services\nSuite 10, The Hub,\nFowler Ave,\nFarnborough\nGU14 7JP\n\n23 December 2020\nRef Windfarms/Clashindarroch II\n\nDear Stewart,\n\nProposed Clashindarroch II, Aberdeenshire, Wind Farm Obstacle Lighting Scheme\n\nYour Reference: 71434 002 dated 5 October 2020\n\nThank you for the above report in which you discuss proposed lighting schemes for\n\n1.\nthe Clashindarroch II wind farm.\n\nThe proposed development is a fourteen-turbine wind farm, near Huntly in\n\n2.\nAberdeenshire. The turbines are proposed have ground to tip heights of 180m, which brings\nthem into scope of the Air Navigation Order (ANO) Article 222 for aeronautical obstacle\nlighting.\n\nThe Air Navigation Order article 222 requires that all obstructions at a height of 150\n\n3.\nm or more above ground are fitted with medium intensity steady red lights positioned as\nclose as possible to the top of the obstac

</br></br>

## Parsing documents: whole document processing

We can pass the LLM the entire document and ask it a few questions. This is the simplest approach but we'll run into problems if the document is too long.

Thankfully, the documents are short.

In [109]:

document_text = consulations_md[0]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type":"json_object"},
    messages=[
        {"role": "system", "content": """
         
         You are an assistant that summarises planning consultations. Be concise and rely only on the text content.

         Your task is to identify the party consulted, whether they object to the proposal, their reasons for objection, and any suggestions they made.

         You must follow this exact JSON structure:

            {
                "party": the name of the organisation or individual consulted,
                "objects" : bool, whether the consultee objects to the proposal,
                "reasons_for_objection": a list of reasons for the objection,
                "suggestions": a list of suggestions made by the consultee
            }

         
         """},
        {"role": "user", "content": document_text},
    ]
)

# Print the JSON response
print(response.choices[0].message.content)

{
    "party": "Safety and Airspace Regulation Group",
    "objects": true,
    "reasons_for_objection": [
        "Require medium intensity lighting for visible lighting",
        "Concerns regarding compliance with CAA obstacle lighting requirements",
        "Need for uniformity with ICAO standards"
    ],
    "suggestions": [
        "Fit medium intensity steady red lights on nacelles of specific turbines",
        "Use a second light on the nacelles to act as an alternate in case of failure",
        "Allow dimming of lights to 10% intensity in good visibility",
        "Install infra-red lights to MoD specification on perimeter turbines"
    ]
}


</br></br>

Wrapping this in a function, we can easily process multiple documents.

In [110]:
def parse_consultation(document_path):
    """
    (str) -> str, dict

    Extracts useful information about a planning consultation by converting it to markdown and requesting a 
    summary from the OpenAI API. Returns the markdown text and a dictionary containing the summary.
    """

    # Convert the PDF to markdown
    md = MarkItDown()
    converted_doc = md.convert(document_path).text_content

    # Request a summary from the OpenAI API
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type":"json_object"},
        messages=[
            {"role": "system", "content": """
            
            You are an assistant that summarises planning consultations. Be concise and rely only on the text content.

            Your task is to identify the party consulted, whether they object to the proposal, their reasons for objection, and any suggestions they made.

            You must follow this exact JSON structure:

                {
                    "party": the name of the organisation or individual consulted,
                    "objects" : bool, whether the consultee objects to the proposal,
                    "reasons_for_objection": a list of reasons for the objection,
                    "suggestions": a list of suggestions made by the consultee
                }

            
            """},
            {"role": "user", "content": converted_doc},
        ]
    )

    return converted_doc, json.loads(response.choices[0].message.content) 

Let's try our function out:

In [111]:
os.makedirs("parsed_consultations", exist_ok=True) # Always save the data you receive so you don't have to re-run the same requests
                                                   # LLM requests cost money, so avoid redundancy

parsed_consultations = []

for i, path in enumerate(consulation_paths):
    print(f"Processing consultation {i+1}")
    md_text, summary = parse_consultation(path)
    with open(f"parsed_consultations/consultation_{i+1}.json", "w") as f:
        json.dump(summary, f)  # the summary is a dictionary, so save it as JSON
    parsed_consultations.append(summary)

Processing consultation 1
Processing consultation 2
Processing consultation 3
Processing consultation 4
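
One way to put that advice into practice is to check for a saved result before making a request. The sketch below is an assumption about how this could be organised, not part of the workshop code: `load_or_compute` and its arguments are hypothetical names, and `compute` stands in for a call such as `parse_consultation`.

```python
import json
import os
import tempfile


def load_or_compute(cache_path, compute):
    """Return the cached JSON result if it exists, otherwise compute and save it."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    result = compute()  # e.g. a function that calls the LLM
    with open(cache_path, "w") as f:
        json.dump(result, f)
    return result


# Demonstration with a dummy computation that records how often it is called
calls = []


def expensive_summary():
    calls.append(1)
    return {"party": "Example consultee", "objects": False}


cache_file = os.path.join(tempfile.mkdtemp(), "consultation_example.json")
first = load_or_compute(cache_file, expensive_summary)
second = load_or_compute(cache_file, expensive_summary)  # served from disk
print(len(calls))
```

The second call reads the saved file, so the dummy computation only runs once.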


## Parsing documents: chunking

Our consultations are short - just a couple of pages each - so we can process them in one go. 

</br></br>

Longer documents are more difficult:
- Cost. We are charged per token of input and output. Giving the LLM a long document can be expensive.
- Complexity. The LLM can struggle with long documents. It can lose track of the context and give irrelevant answers.
- Context windows. Models can only read a limited number of tokens at once. These limits are typically generous (e.g. GPT-4o's is 128K tokens, roughly 100K words), but a very long document can still exceed them.
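
To get a feel for the first and third points, a rough estimate of a document's token count is often enough. The sketch below assumes about four characters per token, a common rule of thumb for English text and not an exact figure (a tokenizer library such as `tiktoken` gives exact counts). The function names, the default limit and the space reserved for the reply are illustrative.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate: assumes ~4 characters per token for English text."""
    return len(text) // chars_per_token


def fits_in_context(text, context_window=128_000, reserved_for_output=4_000):
    """Check whether the text is likely to fit, leaving room for the reply."""
    return estimate_tokens(text) <= context_window - reserved_for_output


short_doc = "word " * 1_000    # 5,000 characters
long_doc = "word " * 200_000   # 1,000,000 characters

print(estimate_tokens(short_doc), fits_in_context(short_doc))
print(estimate_tokens(long_doc), fits_in_context(long_doc))
```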

A solution is *chunking*. We can split the document into smaller sections then choose the most relevant ones to process.

First, let's load the document as before:

In [113]:
md = MarkItDown()
converted_doc = md.convert("council_report.pdf").text_content
converted_doc

'SCOTTISH BORDERS COUNCIL\n\nPLANNING AND BUILDING STANDARDS COMMITTEE\n\n6 MARCH 2017\n\nAPPLICATION FOR CONSENT UNDER SECTION 36 OF THE ELECTRICITY ACT\n1989\n\nITEM:\n\nREFERENCE NUMBER: 14/00530/S36\n\nOFFICER:\nWARD:\nPROPOSAL:\n\nSITE:\n\nAPPLICANT:\nAGENT:\n\nJulie Hayward\nHawick and Denholm\nErection  of  15  turbines  132  high  to  tip,  access  track,\ncompound,  permanent anemometer mast and 2 no borrow\npits\nLand North, South,  East  and West of Birneyknowe Cottage\nHawick\nBanks Renewables\nNone\n\n1.0\n\n1.1\n\nPURPOSE OF REPORT\n\nTo  advise  the  Scottish  Government  of  the  response  from  Scottish  Borders\nCouncil  on  the  application  by  Banks  Renewables  to  construct  a  15  turbine\nwind  farm  on  land  north,  south,    east    and  west  of  Birneyknowe  Cottage\nHawick.\n\n2.0\n\nPROCEDURE\n\n2.1\n\n2.2\n\n2.3\n\n3.0\n\n3.1\n\nScottish  Borders  Council  (SBC)  is  a  consultee  as  a  ‘relevant  planning\nauthority’.\n\nThe  views  of  SBC  will  be 

</br></br>

Next we should split it into short chunks. These should be long enough to contain a coherent section of text but short enough to avoid the problems we discussed earlier. There should also be some overlap between the chunks to ensure that the LLM has enough context to understand the text.

There are lots of ways to do this: we could split the text by paragraph, by sentence, or by a fixed number of words.

For simplicity, we'll just keep adding lines until we reach a certain length.

</br></br>

In [114]:
def chunk_markdown(md_text, max_chars=3000):
    """Chunks some markdown by adding new lines until exceeding max_chars.
       Each chunk includes the last line of the previous chunk."""
    
    lines = md_text.split("\n")  # Split into lines
    chunks = []
    current_chunk = []
    current_length = 0

    for i, line in enumerate(lines):
        # If adding this line would exceed max_chars, close the current chunk
        if i > 0 and current_length + len(line) > max_chars:
            chunks.append("\n".join(current_chunk))  # Save the current chunk
            current_chunk = [lines[i-1]]  # Start new chunk with the preceding line, for context
            current_length = len(lines[i-1])  # Reset length tracker

        current_chunk.append(line)
        current_length += len(line) + 1  # +1 for the newline character

    # Add the last chunk
    if current_chunk:
        chunks.append("\n".join(current_chunk))

    return chunks
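
For comparison, here is a sketch of one of the alternatives mentioned above: splitting by a fixed number of words, with a configurable overlap. The function name and default sizes are illustrative choices, not recommendations.

```python
def chunk_by_words(text, words_per_chunk=200, overlap=20):
    """Split text into chunks of `words_per_chunk` words.

    Consecutive chunks share `overlap` words so that context is not
    lost at the boundaries.
    """
    if not 0 <= overlap < words_per_chunk:
        raise ValueError("overlap must be smaller than words_per_chunk")

    words = text.split()
    step = words_per_chunk - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + words_per_chunk]))
        if start + words_per_chunk >= len(words):
            break  # the window has reached the end of the text
    return chunks


sample_text = " ".join(f"w{i}" for i in range(10))
word_chunks = chunk_by_words(sample_text, words_per_chunk=4, overlap=1)
print(word_chunks)
```

Splitting on whitespace discards the line breaks in the markdown, which is one reason to prefer a line-based approach here. The rest of this section continues with `chunk_markdown`.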


Let's get the chunks for the document:

In [115]:
chunks = chunk_markdown(converted_doc, max_chars=3000)

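Next, we get an embedding for every chunk. An embedding is a list of numbers that represents the meaning of a piece of text, so chunks about similar topics end up with similar vectors: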

In [116]:
def get_embedding(text, model="text-embedding-3-small"):
    text = text.replace("\n", " ")
    return client.embeddings.create(input = [text], model=model).data[0].embedding

In [117]:
chunks_with_embeddings = []

for i, chunk in enumerate(chunks):
    print(f".", end="")
    embedding = get_embedding(chunk)
    chunks_with_embeddings.append({"chunk": i, "text": chunk, "embedding": embedding})

with open("chunks_with_embeddings.pkl", "wb") as f: # it's always good to save the data you've received
    pickle.dump(chunks_with_embeddings, f)

........................................

Next, we can get the embedding for our question, and find the most similar chunk to it.

We'll use `annoy` to find the most relevant chunks to process for a given question.

In [118]:
question = "How many wind turbines are proposed?"
question_embedding = get_embedding(question)

# Define the Annoy index - the index is the data structure that will store the embeddings
embedding_dim = len(chunks_with_embeddings[0]["embedding"])  # Get vector size
annoy_index = AnnoyIndex(embedding_dim, "angular")  # Angular distance for similarity

# Add chunks to Annoy index
for item in chunks_with_embeddings:
    annoy_index.add_item(item["chunk"], item["embedding"])

# Build the index (the argument is the number of 'trees' - more trees = more accurate but slower)
annoy_index.build(10)

# Find the chunks most similar to the question
n_nearest = 3
nearest_chunks = annoy_index.get_nns_by_vector(question_embedding, n_nearest, search_k=-1, include_distances=False)

nearest_chunks

[2, 37, 3]
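
With a small number of chunks, an exact search is also cheap, and it can serve as a cross-check on the approximate index. The sketch below ranks vectors by cosine similarity in plain Python. The toy three-dimensional vectors stand in for real embeddings, and `top_k_by_cosine` is an illustrative name. Annoy's `angular` distance is sqrt(2(1 - cos)), so ranking by highest cosine similarity gives the same order as ranking by lowest angular distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means the same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query, vectors, k=3):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

print(top_k_by_cosine(toy_query, toy_vectors, k=2))
```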

Let's take a look at the most relevant chunk it found:

In [119]:
chunks[nearest_chunks[0]]

'offices  and  welfare  facilities,  toilet  facilities  with  a  packaged  treatment\nsystem,  containerised  storage  areas,  parking  for  cars  and  construction\nvehicles and a bunded area for the storage of fuels;\n\n•  Nine water course crossings;\n\n•  Two borrow pits to provide stone for the development, to be reinstated\n\npost-construction.\n\n4.2\n\n4.3\n\n4.4\n\n5.0\n\nThe  development  would  have  an  18  month  construction  period.    The  wind\nfarm would have a 25 year operational life and a 12 month decommissioning\nperiod.\n\nThe turbines would be three bladed, 80m to hub, with a 104m rotor diameter\nand  52m  long  blades.    The  precise  model  would  be  selected  upon  consent\nbeing granted.   They  would  have  a  semi-matt  light grey finish  and  would  be\ncomputer  controlled  to  face  the  optimum  wind  direction.    The  proposal\nincludes  a  micro-siting  allowance  of  50m  for  the  turbines  and  associated\ninfrastructure  post  consent  follow

It looks like a good match! Let's pass it to the LLM to get some information. 

Not every chunk will be relevant, so we should check the response to make sure it's useful.
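
A check of that kind might look like the sketch below. It assumes each reply has already been parsed into a dictionary with an `answer_contained` flag and a `value` field (the format requested in the next cell). The sample replies here are made up, and `usable_answers` is an illustrative name.

```python
def usable_answers(parsed_responses):
    """Keep only replies that claim to contain an answer and give an integer value."""
    return [
        r for r in parsed_responses
        if r.get("answer_contained") is True and isinstance(r.get("value"), int)
    ]


# Made-up replies in the requested format
sample_responses = [
    {"answer_contained": False, "explanation": "Not stated.", "evidence": "", "value": None},
    {"answer_contained": True, "explanation": "Stated directly.", "evidence": "a made-up quote", "value": 7},
]

answers = usable_answers(sample_responses)
print([a["value"] for a in answers])
```

Checking the quoted `evidence` against the chunk text would be a natural further step, since a model can report an answer that the text does not support.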

In [120]:
messages = [
    {"role": "system", "content": "You are an assistant that answers questions about council reports.  Not every question will be answerable. Follow the specified JSON structure exactly."},
    {"role": "user", "content": '''How many turbines are proposed for Birneyknowe? 
        Answer with the format {
            "answer_contained": bool, whether the text contains enough information to confidently answer the question
            "explanation": str, explanation of how you arrived at the answer
            "evidence": str, a direct quote from the report that supports your answer
            "value": int, number of turbines proposed
        }
        '''},
]

responses = []

for chunk_id in nearest_chunks:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # the replies are parsed with json.loads below
        # Supply the chunk as a user message: it is source text, not something the model said
        messages=messages + [{"role": "user", "content": "Report text:\n" + chunks[chunk_id]}],
    )
    responses.append(response.choices[0].message.content)

Do any of the chunks provide the answer?

In [121]:
responses = [json.loads(response) for response in responses]
responses

[{'answer_contained': False,
  'explanation': 'The text does not provide a specific number of turbines proposed for Birneyknowe. While it discusses details about the turbines, it lacks a quantitative figure.',
  'evidence': 'There is no direct quote from the report that specifies the number of turbines proposed for Birneyknowe.',
  'value': None},
 {'answer_contained': False,
  'explanation': 'The provided text discusses various environmental impacts and considerations regarding a proposed wind farm at Birneyknowe, but it does not specify the number of turbines proposed.',
  'evidence': '',
  'value': 0},
 {'answer_contained': True,
  'explanation': 'The report discusses the evolution of the design from an initial proposal of 20 turbines to the current proposal of 15 turbines. This indicates a definitive number for the current proposal.',
  'evidence': 'The design has evolved to the 15 turbines now proposed following feedback from consultees and a full technical appraisal.',
  'value':