In [12]:
youtube_template_string = """
You are an assistant in football (soccer) scouting, and provides answers to questions by using fact based information.
Use the following information to provide a concise answer to the question enclosed in <question> tags.
If you don't know the answer from the context, just say that you don't know

<context>
{context}
</context>

<question>
{question}
</question>

{format_instruction}
Use this format for every unique player ID.
"""

In [13]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

player_id_schema = ResponseSchema(
    name="player_id",
    description="ID of the player that the report is about which is an integer between 0 and 99999999",
)

report_summary_schema = ResponseSchema(
    name="report_summary",
    description="Summary of the report content about the player",
)

response_schemas = [player_id_schema, report_summary_schema]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instruction = output_parser.get_format_instructions()

print(format_instruction)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"player_id": string  // ID of the player that the report is about which is an integer between 0 and 99999999
	"report_summary": string  // Summary of the report content about the player
}
```
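The printed schema types `player_id` as a string, while its description asks for an integer between 0 and 99999999. Code that consumes the model's answer may therefore want to convert and range-check the value itself. A minimal sketch, assuming a hypothetical helper named `validate_player_id` that is not part of LangChain:

```python
def validate_player_id(raw_id):
    """Convert a player_id value to int and check the range given in the schema description."""
    player_id = int(str(raw_id).strip())  # raises ValueError for non-numeric input
    if not 0 <= player_id <= 99_999_999:
        raise ValueError(f"player_id out of range: {player_id}")
    return player_id


print(validate_player_id("159264"))
```

Raising `ValueError` in both failure cases keeps the caller's error handling to a single `except` clause.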


In [21]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Milvus


EMBEDDING_MODEL = "nomic-embed-text"
MODEL = "mistral"

model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=EMBEDDING_MODEL)

COLLECTION_NAME = "scouting"
VECTOR_STORE_URI = "http://localhost:19530"


connection_args = {'uri': VECTOR_STORE_URI}
vectorstore = Milvus(
    embedding_function=embeddings,
    connection_args=connection_args,
    collection_name=COLLECTION_NAME,
    vector_field="embeddings",
    primary_field="id",
    auto_id=True
)

retriever = vectorstore.as_retriever(search_kwargs={'k': 3})

# Nothing below this line needs to change when swapping models.
def format_docs(docs):
    """Join the retrieved reports into one context string, each prefixed with its player ID."""
    formatted = "\n\n".join(
        f"Player ID: {doc.metadata['player_transfermarkt_id']}, Report-Content: {doc.page_content}"
        for doc in docs
    )
    print(formatted)  # echo the context for inspection
    return formatted
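The retriever can return several reports for the same player, and the prompt asks for one answer per unique player ID, so it may help to group reports by ID before building the context. The sketch below is one possible approach, not part of the original notebook. `Doc` is a hypothetical stand-in for a LangChain document (only `page_content` and `metadata` are used), and `group_by_player` is an illustrative helper name.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def group_by_player(docs):
    """Map each player ID to the list of report texts retrieved for that player."""
    grouped = defaultdict(list)
    for doc in docs:
        player_id = doc.metadata["player_transfermarkt_id"]
        grouped[player_id].append(doc.page_content.strip())
    return dict(grouped)


# First lines of the three reports retrieved for the query in this notebook.
docs = [
    Doc("Played as a central defender and showed consistent performance in defense.",
        {"player_transfermarkt_id": 159687}),
    Doc("As a central defender, he showed a strong physical presence.",
        {"player_transfermarkt_id": 159687}),
    Doc("Played as a central defender and showed a strong performance.",
        {"player_transfermarkt_id": 159264}),
]

grouped = group_by_player(docs)
for player_id, reports in grouped.items():
    print(player_id, len(reports))
```

The grouped dictionary keeps players in first-seen order, so the retriever's ranking is preserved at the player level.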


In [22]:
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(template=youtube_template_string)

query = "I need a central defender who's very secure defensively"

retrieved_documents = retriever.invoke(query)

formatted_docs = format_docs(retrieved_documents)

messages = prompt.format_messages(context=formatted_docs, question=query, format_instruction=format_instruction)

messages

Player ID: 159687, Report-Content: 
 Played as a central defender and showed consistent performance in defense.
Was again dominant in aerial duels and allowed few chances.
His build-up play was secure, but he took no risks.
Showed good anticipation and was often faster than his opponent.
Speed remained an average aspect of his abilities.
At 30 years old, he remains a solid option in defense, but without great potential for progress.
 

Player ID: 159687, Report-Content: 
 As a central defender, he showed a strong physical presence.
Was dominant in aerial duels and cleared many dangerous situations.
In the build-up play, he was solid but not exceptionally creative.
Showed good anticipation and was often a step ahead of his opponent.
His speed was average, which could be problematic against fast forwards.
At 30 years old, probably not a long-term solution.
 

Player ID: 159264, Report-Content: 
 Played as a central defender and showed a strong performance.
Was very secure defensively and

[HumanMessage(content='\nYou are an assistant in football (soccer) scouting, and provides answers to questions by using fact based information.\nUse the following information to provide a concise answer to the question enclosed in <question> tags.\nIf you don\'t know the answer from the context, just say that you don\'t know\n\n<context>\nPlayer ID: 159687, Report-Content: \n Played as a central defender and showed consistent performance in defense.\nWas again dominant in aerial duels and allowed few chances.\nHis build-up play was secure, but he took no risks.\nShowed good anticipation and was often faster than his opponent.\nSpeed remained an average aspect of his abilities.\nAt 30 years old, he remains a solid option in defense, but without great potential for progress.\n \n\nPlayer ID: 159687, Report-Content: \n As a central defender, he showed a strong physical presence.\nWas dominant in aerial duels and cleared many dangerous situations.\nIn the build-up play, he was solid but no
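The `format_messages` call fills the template's three placeholders (`context`, `question`, `format_instruction`) and wraps the result in a single human message. The substitution step alone can be sketched with plain `str.format`. This illustrates the idea and is not LangChain's implementation; the template here is a shortened stand-in for the one defined earlier.

```python
# Shortened stand-in for the notebook's template; same three placeholders.
template = """<context>
{context}
</context>

<question>
{question}
</question>

{format_instruction}
"""

filled = template.format(
    context="Player ID: 159264, Report-Content: Played as a central defender and showed a strong performance.",
    question="I need a central defender who's very secure defensively",
    # Braces inside a substituted value are inserted verbatim, not re-interpreted.
    format_instruction='Return JSON like {"player_id": string, "report_summary": string}',
)

print(filled)
```

Because only the template itself is scanned for placeholders, the curly braces in the JSON format instructions do not need escaping when they are passed in as a value.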

In [17]:
print(messages[0].content)


You are an assistant in football (soccer) scouting, and provides answers to questions by using fact based information.
Use the following information to provide a concise answer to the question enclosed in <question> tags.
If you don't know the answer from the context, just say that you don't know

<context>
Player ID: 951357, Report-Content: 
 Played as a right back and showed a consistent performance over 90 minutes.
Was secure defensively and allowed few crosses.
Offensively, he set accents with precise passes and well-timed forward runs.
His stamina and commitment were remarkable.
However, he lacked top speed, which could be problematic against fast wingers.
At 25 years old, he is at a good age to improve further.
 

Player ID: 159753, Report-Content: 
 Played as a left back again with impressive speed and strong defensive positioning.
Had difficulties in ball handling under pressure, but was offensively dangerous with some good crosses.
His stamina allows him to perform consistent

In [23]:
response = model.invoke(messages)

response

' ```json\n{\n\t"player_id": "159264",\n\t"report_summary": "Very secure defensively, excellent positioning and wins many duels."\n}\n```'
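The raw response is still a string with the JSON wrapped in a markdown fence. In LangChain the usual next step would be `output_parser.parse(response)`. The sketch below shows the same idea with only the standard library, and collects every fenced block in case the model returns one per player. `parse_player_reports` is an illustrative name, and the sketch assumes the model returns valid JSON without the `//` comments shown in the format instructions.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly so this example can sit inside a fenced block
# Capture the body of every json-tagged fenced block; DOTALL lets "." span newlines.
FENCED_JSON = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def parse_player_reports(text):
    """Return one dict per fenced JSON block found in the model output."""
    return [json.loads(block) for block in FENCED_JSON.findall(text)]


# The response string shown above, reassembled from pieces.
response = (
    " " + FENCE + 'json\n{\n\t"player_id": "159264",\n\t"report_summary": '
    '"Very secure defensively, excellent positioning and wins many duels."\n}\n' + FENCE
)

reports = parse_player_reports(response)
print(reports[0]["player_id"])
print(reports[0]["report_summary"])
```

If a block is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is a reasonable place to log the raw response and retry.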