In [2]:
pip install transformers torch gradio python-docx

Collecting gradio
  Using cached gradio-4.32.1-py3-none-any.whl.metadata (15 kB)
Collecting altair<6.0,>=4.2.0 (from gradio)
  Using cached altair-5.3.0-py3-none-any.whl.metadata (9.2 kB)
Collecting fastapi (from gradio)
  Using cached fastapi-0.111.0-py3-none-any.whl.metadata (25 kB)
Collecting fastapi-cli>=0.0.2 (from fastapi->gradio)
  Using cached fastapi_cli-0.0.4-py3-none-any.whl.metadata (7.0 kB)
Using cached gradio-4.32.1-py3-none-any.whl (12.3 MB)
Using cached altair-5.3.0-py3-none-any.whl (857 kB)
Using cached fastapi-0.111.0-py3-none-any.whl (91 kB)
Using cached fastapi_cli-0.0.4-py3-none-any.whl (9.5 kB)
Installing collected packages: fastapi-cli, altair, fastapi, gradio
Successfully installed altair-5.3.0 fastapi-0.111.0 fastapi-cli-0.0.4 gradio-4.32.1
Note: you may need to restart the kernel to use updated packages.


In [3]:
import docx
from transformers import pipeline
import gradio as gr



In [4]:
# Read a .docx document and return its text, one paragraph per line
def docx_to_text(filename):
    doc = docx.Document(filename)
    return '\n'.join(para.text for para in doc.paragraphs)
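
In python-docx, `doc.paragraphs` covers body paragraphs only, so text inside tables is not returned by the function above. If the policy files contain tables, a variant along these lines could pick them up. This is a minimal sketch: the name `document_to_text` is hypothetical, and it assumes `doc` is the object returned by `docx.Document(filename)`.

```python
def document_to_text(doc):
    """Collect paragraph text and table-cell text from a python-docx Document.

    Paragraphs are emitted first and tables afterwards, so the original
    ordering between paragraphs and tables is not preserved.
    """
    lines = [para.text for para in doc.paragraphs]
    for table in doc.tables:
        for row in table.rows:
            # Join the cells of one row with tabs so each row stays on one line.
            lines.append('\t'.join(cell.text for cell in row.cells))
    return '\n'.join(lines)


# Usage (requires python-docx and an existing file):
# import docx
# text = document_to_text(docx.Document('Vacation Policy.docx'))
```

Passing in the opened document rather than a filename keeps the helper easy to try out with small stand-in objects.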

In [5]:
# Load the policy documents
text_vacation_policy = docx_to_text('Vacation Policy.docx')
text_professional_development = docx_to_text('Professional Development Policy.docx')
text_maternity_leave = docx_to_text('Maternity Leave Policy.docx')

In [6]:
# Concatenate all policy texts into a single corpus
corpus = text_vacation_policy + "\n" + text_professional_development + "\n" + text_maternity_leave
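
Once the texts are joined, nothing records which policy a given passage came from. One possible approach, sketched below, is to remember the character span of each policy while building the corpus; the question-answering pipeline's result includes a `start` character offset that could then be mapped back to a policy name. The helpers `build_corpus` and `source_of` are hypothetical names, and the texts in the demo are placeholders, not the real policy contents.

```python
def build_corpus(policies):
    """Join policy texts with newlines and record each text's character span.

    `policies` maps a policy name to its text. Returns (corpus, spans), where
    spans is a list of (name, start, end) offsets into corpus.
    """
    parts, spans, position = [], [], 0
    for name, text in policies.items():
        spans.append((name, position, position + len(text)))
        parts.append(text)
        position += len(text) + 1  # +1 for the newline separator
    return '\n'.join(parts), spans


def source_of(offset, spans):
    """Return the name of the policy whose span contains the character offset."""
    for name, start, end in spans:
        if start <= offset < end:
            return name
    return None


# Placeholder texts stand in for the real policy documents.
demo_corpus, demo_spans = build_corpus({
    'Vacation Policy': 'first policy text',
    'Maternity Leave Policy': 'second policy text',
})
print(source_of(20, demo_spans))
```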

In [7]:
# Load the question-answering pipeline
qa_pipeline = pipeline('question-answering', model='distilbert-base-uncased-distilled-squad')

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


In [8]:
# Answer a question using the QA pipeline
def answer_question(question):
    result = qa_pipeline(question=question, context=corpus)
    return result['answer']
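
The function above returns only `result['answer']`, but the pipeline's result also carries a `score`. An extractive model picks a span from the context even when the question is not covered by the policies, so a score check may help filter weak answers. The sketch below is one way to do that; `format_answer` is a hypothetical helper, the 0.3 threshold is an arbitrary illustration rather than a tuned value, and the dicts in the demo are hand-written stand-ins for real pipeline output.

```python
def format_answer(result, min_score=0.3):
    """Return the answer text, or a fallback message when the score is low.

    `result` is assumed to be the dict returned by the question-answering
    pipeline, with at least the keys 'answer' and 'score'.
    """
    if result['score'] < min_score:
        return "I could not find a confident answer in the policies."
    return f"{result['answer']} (score: {result['score']:.2f})"


# Hand-written dicts stand in for real pipeline output.
print(format_answer({'answer': '120 days', 'score': 0.91}))
print(format_answer({'answer': 'the employee', 'score': 0.05}))
```

In the notebook, `answer_question` could return `format_answer(result)` instead of `result['answer']`.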

In [9]:
# Test the answer function with a few questions
questions_to_test = [
    "What is the duration of maternity leave?",
    "When can maternity leave start?",
    "What is the objective of the Professional Development Policy?",
    "How are vacations divided?",
    "What is the remuneration during maternity leave?",
    "What does the Vacation Policy aim to ensure?",
]

for question in questions_to_test:
    print(f"Question: {question}")
    print(f"Answer: {answer_question(question)}\n")

Question: What is the duration of maternity leave?
Answer: 120 days

Question: When can maternity leave start?
Answer: within the period of 28 days before delivery

Question: What is the objective of the Professional Development Policy?
Answer: Encourage the professional growth of employees through opportunities for learning and career development

Question: How are vacations divided?
Answer: up to three periods

Question: What is the remuneration during maternity leave?
Answer: the employee will receive their full salary

Question: What does the Vacation Policy aim to ensure?
Answer: all employees have adequate periods of rest
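
The printed answers above can be turned into a small repeatable check, which is useful if the model or the policy files change later. The sketch below takes the answer function as a parameter so it can be exercised without loading the model; `check_answers` is a hypothetical helper, and the expected substrings are copied from the output above.

```python
def check_answers(answer_fn, expected):
    """Return (question, answer) pairs whose answer lacks the expected substring."""
    failures = []
    for question, substring in expected.items():
        answer = answer_fn(question)
        if substring.lower() not in answer.lower():
            failures.append((question, answer))
    return failures


# Expected substrings taken from the output printed above.
expected = {
    "What is the duration of maternity leave?": "120 days",
    "How are vacations divided?": "three periods",
}

# In the notebook: failures = check_answers(answer_question, expected)
```

An empty list means every answer still contains its expected substring.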



In [10]:
# Define the Gradio interface
interface = gr.Interface(
    fn=answer_question,
    inputs="text",
    outputs="text",
    title="HR Policy Chatbot",
    description="Ask questions about the company's HR policies."
)

In [11]:
# Launch the interface
interface.launch()

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.


