Chatguy 🤖

A powerful NLP dataset creator.

Overview

Chatguy is a dataset generator for Brazilian Portuguese. It uses natural language processing techniques to generate sentences from synonyms and recognized entities. Its functions fall into two groups: generating synonyms for individual words, and generating phrases or sentences.

Requirements:

  • Python 3.8
  • Docker
  • Docker-compose
  • Docker swarm
  • Git
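
Before starting, it can help to confirm that the required command-line tools are on the PATH. The following is a minimal sketch, not part of Chatguy; the executable names in `REQUIRED_TOOLS` and the helper `missing_tools` are assumptions and may need adjusting for your installation.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the requirements listed above.
REQUIRED_TOOLS = ("python3", "docker", "git")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```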

Get Started

  1. Clone the repository:
    git clone https://github.com/Ilhasoft/chatguy-nlp.git

  2. Enter the project directory:
    cd chatguy-nlp

  3. Create a new conda environment:
    conda create --name chatguy python=3.8

  4. Activate the environment:
    conda activate chatguy

  5. Start the services defined in 'docker-compose.yml':
    sudo docker compose up -d

Run the command from the chatguy-nlp directory, with both Docker and Docker swarm available.
If Docker reports a swarm-related error, initialize swarm mode with docker swarm init.
Then start the services from docker-compose.yml again.

The Docker Compose setup installs all requirements and dependencies in the environment and initializes the local Postgres database.

Environment Variables

Variable            Value
POSTGRES_USER       postgres
POSTGRES_PASSWORD   docker
POSTGRES_HOST       127.0.0.1
POSTGRES_PORT       5432
POSTGRES_ADAPTER    postgresql

Once Docker has finished creating the images, export the database credentials so that the application can connect to the database running in Docker.

  6. Export the database credentials in the terminal:
    export POSTGRES_USER=postgres
    export POSTGRES_PASSWORD=docker
    export POSTGRES_HOST=127.0.0.1
    export POSTGRES_PORT=5432
    export POSTGRES_ADAPTER=postgresql
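
As an illustration of how these variables fit together, the sketch below reads them and assembles a connection URL. How Chatguy itself consumes the variables is not described here, so the helper `build_database_url` and its optional `database` parameter are hypothetical; the defaults only mirror the values listed above.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def build_database_url(
    env: Optional[Mapping[str, str]] = None, database: str = ""
) -> str:
    """Assemble a connection URL from the POSTGRES_* variables.

    Hypothetical helper: the application may build its URL differently.
    """
    env = os.environ if env is None else env
    adapter = env.get("POSTGRES_ADAPTER", "postgresql")
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    user = quote(env.get("POSTGRES_USER", "postgres"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", "docker"), safe="")
    host = env.get("POSTGRES_HOST", "127.0.0.1")
    port = env.get("POSTGRES_PORT", "5432")
    url = f"{adapter}://{user}:{password}@{host}:{port}"
    # The database name is not part of the documented variables.
    return f"{url}/{database}" if database else url


if __name__ == "__main__":
    print(build_database_url())
```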

API Reference

Once the environment and dependencies are installed and configured, the application is ready to run.
The application is served through FastAPI. Start it with the following command:

uvicorn --host=0.0.0.0 app:router --reload

Chatguy has 3 main endpoints:

  • /suggest_word
  • /suggest_sentences
  • /store_corrections

All of them accept a JSON request body.

Requests can be sent and managed with any HTTP API client.

POST /suggest_word
Takes a list of input words and returns synonyms for them.
JSON request:

{
	"texts": [
		{
			"word": "qual",
			"generate": true,
			"entity": false,
			"local": true
		},
		{
			"word": "o",
			"generate": true,
			"entity": false,
			"local": false
		},
		{
			"word": "limite",
			"generate": true,
			"entity": true,
			"local": false
		},
		{
			"word": "de",
			"generate": false,
			"entity": false,
			"local": false
		},
		{
			"word": "frases",
			"generate": true,
			"entity": false,
			"local": true
		}
	]
}
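
A request body like the one above can also be built and sent from Python. This is a minimal sketch using only the standard library; the helpers `make_token`, `build_suggest_word_payload` and `post_json` are hypothetical, and the URL assumes uvicorn is listening on its default port 8000 on the local machine. The response format is not documented here, so the sketch simply returns the parsed JSON.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable


def make_token(word: str, generate: bool = True,
               entity: bool = False, local: bool = False) -> Dict[str, Any]:
    """Build one entry of the "texts" list using the fields shown above."""
    return {"word": word, "generate": generate, "entity": entity, "local": local}


def build_suggest_word_payload(tokens: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap the tokens in the request envelope."""
    return {"texts": list(tokens)}


def post_json(url: str, payload: Dict[str, Any]) -> Any:
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_suggest_word_payload([
        make_token("qual", local=True),
        make_token("de", generate=False),
    ])
    # Assumed address: uvicorn's default port on the local machine.
    print(post_json("http://localhost:8000/suggest_word", payload))
```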

POST /suggest_sentence
Takes an input phrase, split into words, and generates synonymous phrases from the suggestions and tagged entities of each word.
JSON request:

{
	"isquestion": true,
	"intent": "teste",
	"texts": [
		{
			"word": "existem",
			"generate": true,
			"entity": "existir",
			"suggestions": [
				"há",
				"existem"
			]
		},
		{
			"word": "muitas",
			"generate": true,
			"entity": false,
			"suggestions": [
				"diversas"
			]
		},
		{
			"word": "pessoas",
			"generate": true,
			"entity": "sujeito",
			"suggestions": [
				"homens",
				"mulheres",
				"crianças"
			]
		},
		{
			"word": "no",
			"generate": false,
			"entity": false,
			"suggestions": [
				"no"
			]
		},
		{
			"word": "mundo",
			"generate": true,
			"entity": false,
			"suggestions": [
				"planeta"
			]
		}
	]
}
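
To give a feel for how per-word suggestions can expand into sentence variants, the sketch below combines them with a Cartesian product. This is only an illustration under stated assumptions, not the service's actual algorithm: it assumes that a word with "generate": true is replaced by its suggestions and that any other word is kept as-is. The function names are hypothetical.

```python
from itertools import product
from typing import Any, Dict, List


def candidate_words(token: Dict[str, Any]) -> List[str]:
    """Return the words that may fill this position in the sentence."""
    suggestions = token.get("suggestions") or []
    if token.get("generate") and suggestions:
        return list(suggestions)
    # Words that are not generated keep their original form.
    return [token["word"]]


def expand_sentences(texts: List[Dict[str, Any]]) -> List[str]:
    """Combine one candidate per position into every possible sentence."""
    options = [candidate_words(token) for token in texts]
    return [" ".join(words) for words in product(*options)]


if __name__ == "__main__":
    texts = [
        {"word": "existem", "generate": True, "suggestions": ["há", "existem"]},
        {"word": "muitas", "generate": True, "suggestions": ["diversas"]},
        {"word": "pessoas", "generate": True,
         "suggestions": ["homens", "mulheres", "crianças"]},
        {"word": "no", "generate": False, "suggestions": ["no"]},
        {"word": "mundo", "generate": True, "suggestions": ["planeta"]},
    ]
    for sentence in expand_sentences(texts):
        print(sentence)
```

With the suggestions from the request above, this naive combination yields six candidate sentences (2 × 1 × 3 × 1 × 1). It does not check grammatical agreement between the combined words.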

POST /store_corrections
Stores sentence corrections in the database.
JSON request:

{
	"texts": [
		[
			"olá tudo bem como você vai?1",
			"olá tudo bem como você vai?2",
			"olá tudo bem como você vai?3"
		],
		[
			"valeu demais, até!1",
			"valeu demais, até!2",
			"valeu demais, até!3"
		]
	]
}
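
The body above is a list of sentence groups, each group being a list of strings. A small client-side check can catch malformed groups before the request is sent. The helper `build_store_corrections_payload` below is a hypothetical sketch; the validation rules (non-empty groups, non-blank strings) are assumptions, not documented server behavior.

```python
import json
from typing import Any, Dict, Iterable, List


def build_store_corrections_payload(
    groups: Iterable[Iterable[str]],
) -> Dict[str, Any]:
    """Validate sentence groups and wrap them in the request envelope."""
    texts: List[List[str]] = []
    for index, group in enumerate(groups):
        sentences = list(group)
        if not sentences:
            raise ValueError(f"group {index} is empty")
        for sentence in sentences:
            # Assumed rule: every entry must be a non-blank string.
            if not isinstance(sentence, str) or not sentence.strip():
                raise ValueError(f"group {index} contains an invalid sentence")
        texts.append(sentences)
    return {"texts": texts}


if __name__ == "__main__":
    payload = build_store_corrections_payload([
        ["olá tudo bem como você vai?1", "olá tudo bem como você vai?2"],
        ["valeu demais, até!1", "valeu demais, até!2"],
    ])
    # ensure_ascii=False keeps accented characters readable in the output.
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```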

Contributing

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request