# All models

These are all the endpoints for the application, prototyped here so they can be tested before implementation.

## Imports

In [4]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForCausalLM

In [2]:
# Models
EMBEDDING_MODEL_NAME = "thenlper/gte-small"
READER_MODEL_NAME = "Qwen/Qwen2.5-1.5B-Instruct"

## *get_context()*

In [None]:
def get_context(query, k=5):
    """ 
    Retrieves relevant context for a given query.

    Parameters:
    query (str): The input query for which context is needed.
    k (int, optional): The number of relevant context elements to retrieve (default is 5).

    Returns:
    list: A list containing relevant context elements.
    """

    # Load embeddings
    embedding_model = HuggingFaceEmbeddings(
        model_name=EMBEDDING_MODEL_NAME,
        multi_process=True,
        model_kwargs={"device": "cpu"},  # replace "cpu" with "cuda" if you have an NVIDIA GPU
        encode_kwargs={"normalize_embeddings": True},  # Set `True` for cosine similarity
    )
    KNOWLEDGE_VECTOR_DATABASE = FAISS.load_local("../outputs/rag_embeddings_thenlper_gte-small", embedding_model, allow_dangerous_deserialization=True)

    # Retrieve docs
    print(f"\nStarting retrieval for {query=}...")
    retrieved_docs = KNOWLEDGE_VECTOR_DATABASE.similarity_search(query=query, k=k)

    return retrieved_docs
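
The `normalize_embeddings` setting above matters because, for unit-length vectors, the dot product equals the cosine similarity, and the squared L2 distance becomes a simple decreasing function of it. The sketch below checks this in plain Python on two made-up 3-dimensional vectors. `dot`, `l2_normalize`, and `cosine_similarity` are illustrative helpers, not LangChain or FAISS functions, and the sketch makes no claim about which metric the saved index actually uses.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Made-up toy vectors, not real model output
query_vec = [1.0, 2.0, 2.0]
doc_vec = [2.0, 0.0, 1.0]

q_unit = l2_normalize(query_vec)
d_unit = l2_normalize(doc_vec)

cos = cosine_similarity(query_vec, doc_vec)
dot_of_units = dot(q_unit, d_unit)
squared_l2_of_units = sum((x - y) ** 2 for x, y in zip(q_unit, d_unit))

print(f"cosine similarity:           {cos:.6f}")
print(f"dot product of unit vectors: {dot_of_units:.6f}")
print(f"squared L2 of unit vectors:  {squared_l2_of_units:.6f}")
```

Since the squared L2 distance between unit vectors equals `2 - 2 * cos`, ranking by smallest distance and ranking by largest cosine similarity return the same order once embeddings are normalized.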

## *get_ai_answer(question)*
```
Improvements:
- In v2, an agent can check and validate the quality of an answer by checking its sources. It can correct the question if it is not acceptable.
```


In [29]:
def get_ai_answer(question):
    """
    Generates an AI-generated response to a given question.

    Parameters:
    question (str): The input question for which an answer is needed.

    Returns:
    str: The generated response from the AI.
    """

    # Load reader model
    model = AutoModelForCausalLM.from_pretrained(READER_MODEL_NAME)
    tokenizer = AutoTokenizer.from_pretrained(READER_MODEL_NAME)
    READER_LLM = pipeline(
        model=model,
        tokenizer=tokenizer,
        task="text-generation",
        do_sample=True,
        temperature=0.2,
        repetition_penalty=1.1,
        return_full_text=False,
        max_new_tokens=500,
    )

    # Build prompt
    # prompt_in_chat_format = [
    # {
    #     "role": "system",
    #     "content": """Use the information contained in the context to provide a comprehensive answer to the question.  
    #     - Answer only the question asked, in a concise and relevant manner.  
    #     - Always cite the sources used by indicating their.  
    #     - Explain why each reference was used to support the answer.  
    #     - If the answer cannot be deduced from the context, do not provide one.
        
    #     Example:
    #     - The correct answer is ...
    #     - Reference sources used: explain each reference and why you use them.
    #     - If the question was multiple choice, explain why the other choices are wrong.
    #     """,
    # },
    # {
    #     "role": "user",
    #     "content": """Context:
    # {context}
    # ---
    # Now here is the question you need to answer.

    # Question: {question}""",
    # },
    # ]
    prompt_in_chat_format_v2 = [
    {
        "role": "system",
        "content": """Use only the information contained in the provided context to generate a precise and relevant answer to the given question.  
        
        **General Rules:**  
        - Answer concisely and directly to the question.  
        - If the answer requires choosing from multiple options, explicitly state the correct answer.  
        - Always cite the sources used and explain their relevance to the answer.  
        - If the answer cannot be deduced from the context, explicitly state that it cannot be answered.  

        **For Multiple Choice Questions (MCQs):**  
        - Start your response with: **"The correct answer is: [option]"** (e.g., "The correct answer is: A.")  
        - Explain why this option is correct based on the provided context.  
        - Briefly justify why the other options are incorrect, if possible.  
        
        **Example of response format:**  
        - **The correct answer is: [option]**  
        - **Justification:** (Explain why this answer is correct, citing sources)  
        - **Why other options are incorrect:** (Briefly explain why the other options do not apply)  

        If the question is not a multiple-choice question, provide a direct and structured answer.  
        """,
    },
    {
        "role": "user",
        "content": """Context:  
    {context}  
    ---  
    Now, answer the following question.  

    **Question:** {question}""",
    }
    ]

    RAG_PROMPT_TEMPLATE = tokenizer.apply_chat_template(prompt_in_chat_format_v2, tokenize=False, add_generation_prompt=True)
    

    # Retrieve context
    retrieved_docs = get_context(question)
    context = "\nExtracted documents:\n"
    context += "".join(f"Content: {doc.page_content} \nSource: {doc.metadata['ref']}\n\n" for doc in retrieved_docs)
    
    # Add context to prompt
    final_prompt = RAG_PROMPT_TEMPLATE.format(question=question, context=context)
    
    # Generate an answer
    answer = READER_LLM(final_prompt)[0]["generated_text"]

    return answer, context
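
As written, `get_ai_answer()` reloads the reader model and tokenizer on every call. One possible way to avoid that is to cache the loader, sketched below with `functools.lru_cache`. `load_reader` and `answer_with_cached_reader` are hypothetical names, and the loader body is a placeholder standing in for the `from_pretrained` and `pipeline` calls, so the example runs without downloading a model.

```python
from functools import lru_cache

LOAD_COUNT = 0  # only here to make the effect of caching visible

@lru_cache(maxsize=None)
def load_reader(model_name):
    """Hypothetical loader: placeholder for the from_pretrained + pipeline calls."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # In the notebook, this is where the text-generation pipeline would be built.
    return {"model_name": model_name}

def answer_with_cached_reader(question, model_name="Qwen/Qwen2.5-1.5B-Instruct"):
    """Reuse the cached reader instead of rebuilding it for every question."""
    reader = load_reader(model_name)  # built on the first call, reused afterwards
    return f"[{reader['model_name']}] would answer: {question}"

for q in ["first question", "second question", "third question"]:
    answer_with_cached_reader(q)

print(f"Reader loaded {LOAD_COUNT} time(s) for 3 questions")
```

The same idea could apply to the embedding model and the FAISS index in `get_context()`, at the cost of keeping them in memory between calls.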

In [30]:
# Test function
question = """Your Client, A Inc, is a sub-licensee under European patent application EP-1. Can the sub-licence be recorded in the European Patent Register?
 
A    No, it is not possible to record sub-licences in the European Patent Register.
 
B    Yes, any sub-licence can be recorded in the European Patent Register.
 
C    Yes, provided the licensee granting the sub-licence has recorded its licence in the European Patent Register.
"""

answer, context = get_ai_answer(question)
print(f'--------------------- Question ---------------------\n\
    {question}\n\n\
    --------------------- Answer ---------------------\n\
    {answer}\n\nContext:{context}')

Device set to use cpu



Starting retrieval for query='Your Client, A Inc, is a sub-licensee under European patent application EP-1. Can the sub-licence be recorded in the European Patent Register?\n\nA    No, it is not possible to record sub-licences in the European Patent Register.\n\nB    Yes, any sub-licence can be recorded in the European Patent Register.\n\nC    Yes, provided the licensee granting the sub-licence has recorded its licence in the European Patent Register.\n'...
--------------------- Question ---------------------
    Your Client, A Inc, is a sub-licensee under European patent application EP-1. Can the sub-licence be recorded in the European Patent Register?

A    No, it is not possible to record sub-licences in the European Patent Register.

B    Yes, any sub-licence can be recorded in the European Patent Register.

C    Yes, provided the licensee granting the sub-licence has recorded its licence in the European Patent Register.


    --------------------- Answer ---------------------
   

### Tests *get_ai_answer()*

In [21]:
test_docs = [{'document': '2024 - EPAC_open et EPC_solution_open',
 'questions': [
     {  'question': """International application WO-X was filed at the EPO on 27 August 2024. No fees have
been paid.
1. What fees are due on filing for WO-X? Fee amounts need not be mentioned.
2. What is the time limit for paying these fees?
3. What happens if these fees are not paid within the time limit, and what can you do
about it?""",
         'answer': """Question 1
1. The fees due on filing are the filing fee (including page fees), the search fee and the
transmittal fee.
2. These fees are to be paid within one month of the date of receipt of the international
application, i.e. 27 September 2024.
3. The applicant is invited to pay the fees within one month of the date of the invitation.
The payment of fees in response to the invitation (under Rule 16bis PCT) may be
subjected by the receiving Office to the payment of a late payment fee, a fee
retained by the receiving Office in question. The late payment fee is 50% of the
international filing fee (without page fees)."""
     },
     {
         'question': """On 25 October 2019, the Spanish University Isabel II and the company Tomato Matters
filed a European patent application in Spanish, accompanied by a translation into English.
Tomato Matters employs more than 260 employees.
The University Isabel II has filed two patent applications with the EPO over the past five
years.
On 10 October 2024, Tomato Matters transfers its rights to Naranjas Navel, a company
which employs 9 members of staff and whose annual turnover is EUR 1 million.
Naranjas Navel has never filed any patent applications with the EPO.
In a communication from the EPO under Rule 71(3) EPC dated 10 October 2024, the
name of the applicants is given as: Isabe III (clerical error) and Tomato Matters.
1. What has to be done to obtain a Unitary Patent as soon as possible for Isabel II and
Naranjas Navel? Is it possible to benefit from the compensation scheme?
Please list the necessary steps at minimum cost. You should identify the fees that have
to be paid, but you do not need to specify their amounts.
2. Let us now suppose that the request for unitary effect has been refused. What is the
time limit for lodging an application to reverse this decision, and to whom should the
application be addressed?""",
         'answer': """Question 2
1. Necessary steps:
 Request for correction of the name of the applicant.
 Request to transfer the application, subject to the payment of an
administrative fee (0 euro if requested using MyEPO Portfolio).
 Declaration regarding requirements for a reduction of fees.
 Payment of reduced sixth renewal fee.
 Payment of reduced fee for grant and printing; filing of translations of the
claims in German and French.
 Once the decision for grant is issued, filing of request for unitary effect (in
English) with translation into any other EU official language.
 Not entitled to compensation for translation costs because Tomato Matters is
not an SME.
2. The action must be filed at the UPC within three weeks of the refusal (Rule 97.1
RoP UPC). The two-month time limit under Rule 88.1 RoP UPC is not applicable,
see Rule 85.2 RoP UPC."""
     },
     {
         'question': """European application EP1 was filed online on 2 September 2024 without claiming priority.
You realise today, 10 October 2024, that priority from CN1 filed in Chinese on 31 August
2023 was not claimed.
1. Explain why it is still possible to claim priority from CN1 and what steps must be taken.
2. On the same day, you realise that, despite all due care, you filed the description of
another application, instead of the priority application translated into English. It was
intended that EP1 should have the same content as CN1.
How can you correct this? What will be the effect on the filing date?
3. What is the consequence with regard to claiming priority from CN1? What action could
be taken?""",
         'answer': """Question 4
1. EP1 was filed within 12 months of CN1 (31 August 2024, extended to 2 September
2024), so priority can be added. The declaration of priority can be made up to 16
months from earliest priority date: 31 December 2024, extended to 2 January 2025.
An applicant wishing to claim priority must file a declaration of priority indicating:
i. the date of the previous application
ii. the state or WTO member in or for which it was filed
iii. the application number
2. The applicant may file of their own volition the correct description within two months
of filing (Rule 56a EPC): 2 November 2024, extended to 4 November 2024. Since
priority was not claimed on filing, Rule 56a(4) EPC does not apply. The application
is re-dated.
3. As the new filing date falls outside the 12-month priority period, a request for
re-establishment in respect of the priority period should also be filed, together with
reasons. And the fee should be paid."""
     }
 ]
},
{
    'document': '2024 - EPAC_MCQ et solution',
    'questions': [
        {
            'question': """1. An international application is published, together with the search report drawn up
by the CNIPA, 18 months + 1 day after the filing date (no priority claimed). The
application has no more than 35 pages and 15 claims.
Which statement reflects all actions the CN applicant needs to take for entry into the
EP phase 25 months after the filing date? A request for early processing has been filed.
A. Complete and file Form 1200 and pay the filing fee and the search fee
B. Complete and file Form 1200 and pay the filing fee, the search fee and the
renewal fee for the third year
C. Complete and file Form 1200, pay the filing fee and the search fee and appoint
a representative
D. None of the above statements""",
            'answer': """Question 1: D
Legal basis
Rule 159(1) EPC
The examination fee and designation fee must also be paid (six months after
publication of search report; payment cannot be postponed to 31 months if there is a
wish to enter into the European phase at 25 months).""",
        },
        {
            'question': """2. A European patent application was filed on 6 February 2023. The search, filing and
designation fees were paid within a month of the date of filing.
What is the latest point in time for withdrawing the application, if the applicant wishes
to obtain a refund of the designation fee?
A. Six months after the date of mention of the publication of the European search
report
B. Date of mention of the publication of the European search report
C. The designation fee was validly paid and can no longer be refunded
D. Date of the start of substantive examination""",
            'answer': """Question 2: B
Legal basis
Guidelines for Examination in the EPO, A-X, 5.2.2
“The designation fee falls due upon publication of the mention of the European
search report. It may be paid within six months of the mentioned date of publication
(Rules 39(1), 17(3) and 36(4)). Where paid before the due date, e.g. upon filing of the
application, the designation fee will however be retained by the EPO. These
payments will only be considered valid as from the due date, provided that the
amount paid corresponds to the amount payable on the due date (…)”""",
        },
        {
            'question': """3. On 10 October 2024, an applicant files a request for entry into the European phase,
together with a debit order, according to which the filing fee, the designation fee, the
examination fee and the renewal fee for the third year are to be debited from the
applicant’s deposit account. It is specified that the debit order is to be executed on
18 October 2024. On the evening of 10 October 2024, the applicant notices that the
renewal fee is not yet due and should not be debited from the deposit account.
What is the latest point in time for revoking the order to debit the renewal fee in Central
Fee Payment (CFP)?
A. 10 October 2024
B. 17 October 2024
C. 18 October 2024
D. A debit order cannot be revoked in part""",
            'answer': """Question 3: B
Legal basis
OJ EPO 2024, Supplementary Publication 2, Arrangements for deposit accounts
(ADA), 13.2
“13.2: A debit order with a deferred payment date (…) may be revoked in whole or in
part in Central Fee Payment until one day before the date specified as the execution
date at the latest.”""",
        },
    ]
}
]

In [31]:
import json
import os

for doc in test_docs:
    doc_name = doc['document']
    
    for QandR in doc['questions']:
        answer, context = get_ai_answer(QandR['question'])
        
        # Store the AI-generated answer in the question dictionary
        QandR['AIAnswer'] = {
            "answer": answer,
            "context": context
        }

        print(f"\n\n\n\nDocument: {doc_name}\n\n"
              f"Question:\n{QandR['question']}\n\n"
              f"RealAnswer:\n{QandR['answer']}\n\n"
              f"AIAnswer:\nAnswer:\n{answer}\nContext:{context}")

# Save results to a JSON file
path_to_save = './tests_get_ai_answer_promptv2.json'
with open(path_to_save, 'w', encoding='utf-8') as f:
    json.dump(test_docs, f, indent=4, ensure_ascii=False)

print(f"\nResults have been saved in {path_to_save}")

Device set to use cpu



Starting retrieval for query='International application WO-X was filed at the EPO on 27 August 2024. No fees have\nbeen paid.\n1. What fees are due on filing for WO-X? Fee amounts need not be mentioned.\n2. What is the time limit for paying these fees?\n3. What happens if these fees are not paid within the time limit, and what can you do\nabout it?'...




Document: 2024 - EPAC_open et EPC_solution_open

Question:
International application WO-X was filed at the EPO on 27 August 2024. No fees have
been paid.
1. What fees are due on filing for WO-X? Fee amounts need not be mentioned.
2. What is the time limit for paying these fees?
3. What happens if these fees are not paid within the time limit, and what can you do
about it?

RealAnswer:
Question 1
1. The fees due on filing are the filing fee (including page fees), the search fee and the
transmittal fee.
2. These fees are to be paid within one month of the date of receipt of the international
application, i.e. 27 September 2024.
3. The

Device set to use cpu



Starting retrieval for query='On 25 October 2019, the Spanish University Isabel II and the company Tomato Matters\nfiled a European patent application in Spanish, accompanied by a translation into English.\nTomato Matters employs more than 260 employees.\nThe University Isabel II has filed two patent applications with the EPO over the past five\nyears.\nOn 10 October 2024, Tomato Matters transfers its rights to Naranjas Navel, a company\nwhich employs 9 members of staff and whose annual turnover is EUR 1 million.\nNaranjas Navel has never filed any patent applications with the EPO.\nIn a communication from the EPO under Rule 71(3) EPC dated 10 October 2024, the\nname of the applicants is given as: Isabe III (clerical error) and Tomato Matters.\n1. What has to be done to obtain a Unitary Patent as soon as possible for Isabel II and\nNaranjas Navel? Is it possible to benefit from the compensation scheme?\nPlease list the necessary steps at minimum cost. You should identify the fees that

Device set to use cpu



Starting retrieval for query='European application EP1 was filed online on 2 September 2024 without claiming priority.\nYou realise today, 10 October 2024, that priority from CN1 filed in Chinese on 31 August\n2023 was not claimed.\n1. Explain why it is still possible to claim priority from CN1 and what steps must be taken.\n2. On the same day, you realise that, despite all due care, you filed the description of\nanother application, instead of the priority application translated into English. It was\nintended that EP1 should have the same content as CN1.\nHow can you correct this? What will be the effect on the filing date?\n3. What is the consequence with regard to claiming priority from CN1? What action could\nbe taken?'...




Document: 2024 - EPAC_open et EPC_solution_open

Question:
European application EP1 was filed online on 2 September 2024 without claiming priority.
You realise today, 10 October 2024, that priority from CN1 filed in Chinese on 31 August
2023 was not claimed.

Device set to use cpu



Starting retrieval for query='1. An international application is published, together with the search report drawn up\nby the CNIPA, 18 months + 1 day after the filing date (no priority claimed). The\napplication has no more than 35 pages and 15 claims.\nWhich statement reflects all actions the CN applicant needs to take for entry into the\nEP phase 25 months after the filing date? A request for early processing has been filed.\nA. Complete and file Form 1200 and pay the filing fee and the search fee\nB. Complete and file Form 1200 and pay the filing fee, the search fee and the\nrenewal fee for the third year\nC. Complete and file Form 1200, pay the filing fee and the search fee and appoint\na representative\nD. None of the above statements'...




Document: 2024 - EPAC_MCQ et solution

Question:
1. An international application is published, together with the search report drawn up
by the CNIPA, 18 months + 1 day after the filing date (no priority claimed). The
application has no more 

Device set to use cpu



Starting retrieval for query='2. A European patent application was filed on 6 February 2023. The search, filing and\ndesignation fees were paid within a month of the date of filing.\nWhat is the latest point in time for withdrawing the application, if the applicant wishes\nto obtain a refund of the designation fee?\nA. Six months after the date of mention of the publication of the European search\nreport\nB. Date of mention of the publication of the European search report\nC. The designation fee was validly paid and can no longer be refunded\nD. Date of the start of substantive examination'...




Document: 2024 - EPAC_MCQ et solution

Question:
2. A European patent application was filed on 6 February 2023. The search, filing and
designation fees were paid within a month of the date of filing.
What is the latest point in time for withdrawing the application, if the applicant wishes
to obtain a refund of the designation fee?
A. Six months after the date of mention of the publication of

Device set to use cpu



Starting retrieval for query='3. On 10 October 2024, an applicant files a request for entry into the European phase,\ntogether with a debit order, according to which the filing fee, the designation fee, the\nexamination fee and the renewal fee for the third year are to be debited from the\napplicant’s deposit account. It is specified that the debit order is to be executed on\n18 October 2024. On the evening of 10 October 2024, the applicant notices that the\nrenewal fee is not yet due and should not be debited from the deposit account.\nWhat is the latest point in time for revoking the order to debit the renewal fee in Central\nFee Payment (CFP)?\nA. 10 October 2024\nB. 17 October 2024\nC. 18 October 2024\nD. A debit order cannot be revoked in part'...




Document: 2024 - EPAC_MCQ et solution

Question:
3. On 10 October 2024, an applicant files a request for entry into the European phase,
together with a debit order, according to which the filing fee, the designation fee, the
examina

In [32]:
for doc in test_docs:
    doc_name = doc['document']
    for QandR in doc['questions']:
        print(f"\n\n\n\nDocument: {doc_name}\n\n"
              f"Question:\n{QandR['question']}\n\n"
              f"RealAnswer:\n{QandR['answer']}\n\n"
              f"AIAnswer:\nAnswer:\n{QandR['AIAnswer']['answer']}\nContext:{QandR['AIAnswer']['context']}")





Document: 2024 - EPAC_open et EPC_solution_open

Question:
International application WO-X was filed at the EPO on 27 August 2024. No fees have
been paid.
1. What fees are due on filing for WO-X? Fee amounts need not be mentioned.
2. What is the time limit for paying these fees?
3. What happens if these fees are not paid within the time limit, and what can you do
about it?

RealAnswer:
Question 1
1. The fees due on filing are the filing fee (including page fees), the search fee and the
transmittal fee.
2. These fees are to be paid within one month of the date of receipt of the international
application, i.e. 27 September 2024.
3. The applicant is invited to pay the fees within one month of the date of the invitation.
The payment of fees in response to the invitation (under Rule 16bis PCT) may be
subjected by the receiving Office to the payment of a late payment fee, a fee
retained by the receiving Office in question. The late payment fee is 50% of the
international filing fee (witho
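
The test loop above prints the reference and AI answers side by side, which leaves the comparison to the reader. For the MCQ documents, a rough automatic check is possible because the prompt asks the model to start with "The correct answer is: [option]" and the reference answers start with "Question N: X". The sketch below assumes those two formats hold. The helper names and the sample AI answers are made up for illustration and are not outputs of the notebook.

```python
import re

def extract_predicted_option(ai_answer):
    """Return the option letter following 'The correct answer is', or None."""
    match = re.search(r"(?i:the correct answer is):?\s*\**\s*\[?([A-D])\b", ai_answer)
    return match.group(1) if match else None

def extract_expected_option(real_answer):
    """Return the letter from a reference answer such as 'Question 1: D'."""
    match = re.match(r"\s*Question\s+\d+:\s*([A-D])\b", real_answer)
    return match.group(1) if match else None

def score_mcq(questions):
    """Count correct MCQ predictions; open questions (no expected letter) are skipped."""
    correct = total = 0
    for item in questions:
        expected = extract_expected_option(item["answer"])
        if expected is None:
            continue
        total += 1
        if extract_predicted_option(item["AIAnswer"]["answer"]) == expected:
            correct += 1
    return correct, total

# Made-up sample in the same shape as the entries of test_docs after the loop
sample = [
    {"answer": "Question 1: D\nLegal basis",
     "AIAnswer": {"answer": "**The correct answer is: D**\nJustification: ...", "context": ""}},
    {"answer": "Question 2: B\nLegal basis",
     "AIAnswer": {"answer": "The correct answer is: A.", "context": ""}},
    {"answer": "Question 1\n1. The fees due on filing are ...",
     "AIAnswer": {"answer": "The fees are ...", "context": ""}},
]

correct, total = score_mcq(sample)
print(f"MCQ accuracy: {correct}/{total}")
```

Open questions still need manual review, since their reference answers have no single option letter to compare against.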

# Not implemented yet

## *get_question(subcategory)*
```
def get_question(sub_categorie)
	Fetches 3 random questions of the right category from the MAGB DB.
	Retrieves context for these 3 questions.
	Passes everything to an LLM, which generates a question.
	When a question is generated, its AI answer is generated right afterwards, while the user is answering, so that it can be stored in the database and the user can get feedback quickly.
	In v2, an agent checks the quality of the sources and makes corrections.
```
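The first steps of this plan (sampling three questions from one category and assembling a generation prompt) can be sketched without a database or a model. Everything below is an assumption made for illustration: `QUESTION_DB` is a made-up in-memory stand-in for the MAGB DB, and `sample_questions` and `build_generation_prompt` are hypothetical helper names.

```python
import random

# Hypothetical in-memory stand-in for the MAGB question database
QUESTION_DB = [
    {"category": "fees", "question": "Which fees are due on filing?"},
    {"category": "fees", "question": "When does the designation fee fall due?"},
    {"category": "fees", "question": "Can a debit order be revoked in part?"},
    {"category": "fees", "question": "What is the late payment fee?"},
    {"category": "priority", "question": "How long is the priority period?"},
]

def sample_questions(sub_categorie, n=3, rng=random):
    """Pick up to n random questions from the requested category."""
    pool = [row["question"] for row in QUESTION_DB if row["category"] == sub_categorie]
    return rng.sample(pool, min(n, len(pool)))

def build_generation_prompt(examples, context):
    """Assemble the text that would be sent to the LLM to write a new question."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(examples, start=1))
    return (
        "Write one new exam question in the style of the examples.\n\n"
        f"Examples:\n{numbered}\n\n"
        f"Context:\n{context}"
    )

examples = sample_questions("fees", rng=random.Random(0))
prompt = build_generation_prompt(examples, context="(retrieved context would go here)")
print(prompt)
```

Using `min(n, len(pool))` keeps the sketch from raising an error when a category holds fewer than three questions.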

## *get_feedback(question, reponse_ia, reponse_user)*

```
def get_feedback(question, reponse_ia, reponse_user)
	Retrieves context for all of these inputs.
	Uses our LLM to generate feedback explaining why the user was wrong and what was incorrect.
	Returns the AI's actual answer to the user along with our analysis.
```
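As a minimal sketch of the prompt-building step, the function below assembles chat-format messages in the same `role`/`content` shape used in `get_ai_answer()`. The function name `build_feedback_messages`, the `context` parameter, and the instruction wording are assumptions. Retrieval and generation are left out.

```python
def build_feedback_messages(question, reponse_ia, reponse_user, context):
    """Return chat-format messages asking the LLM to explain the user's mistakes."""
    system = (
        "Compare the user's answer with the reference answer, using only the "
        "provided context. Explain what was wrong in the user's answer and why."
    )
    user = (
        f"Context:\n{context}\n---\n"
        f"Question: {question}\n\n"
        f"Reference answer: {reponse_ia}\n\n"
        f"User answer: {reponse_user}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_feedback_messages(
    question="Can the sub-licence be recorded in the European Patent Register?",
    reponse_ia="(AI answer would go here)",
    reponse_user="(user answer would go here)",
    context="(retrieved context would go here)",
)
for message in messages:
    print(f"{message['role']}: {message['content']}\n")
```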

## *free_speach(history)*

```
def free_speach(history)
	This is the feature where the user can chat with the model after receiving feedback. It takes as input the entire conversation history, i.e. the question, the user's answer, and the AI's answer and analysis.
	When a user sends it a message, it retrieves context from the RAG and replies.
```
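One way to feed that history to a chat model is to flatten it into a message list and append the new user turn together with freshly retrieved context. The sketch below assumes a hypothetical `history` dictionary with the keys `question`, `reponse_user`, `reponse_ia`, `analysis`, and an optional `chat` list of earlier turns. None of these names come from an existing implementation.

```python
def history_to_messages(history, new_message, context):
    """Flatten the exercise history into chat messages and append the new user turn."""
    messages = [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "assistant", "content": f"Question: {history['question']}"},
        {"role": "user", "content": history["reponse_user"]},
        {
            "role": "assistant",
            "content": f"{history['reponse_ia']}\n\nAnalysis: {history['analysis']}",
        },
    ]
    # Earlier free-chat turns, if any, keep their original order
    messages.extend(history.get("chat", []))
    messages.append(
        {"role": "user", "content": f"Context:\n{context}\n---\n{new_message}"}
    )
    return messages

history = {
    "question": "(generated question)",
    "reponse_user": "(user answer)",
    "reponse_ia": "(AI answer)",
    "analysis": "(feedback analysis)",
}
messages = history_to_messages(
    history,
    new_message="Why is option B wrong?",
    context="(retrieved context would go here)",
)
print(len(messages), "messages, last role:", messages[-1]["role"])
```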