Text Classification project

In this project we classify text with a neural network, following a supervised learning strategy. The project is written in pure Go with the Gonum package, and go-chi serves the API.

Beginning 🚀

These instructions help you get a copy of the project running on your local machine.

Prerequisites 📋

go1.16.5  ->  https://golang.org/dl/
go-chi    ->  https://pkg.go.dev/github.com/go-chi/chi
gonum     ->  https://pkg.go.dev/gonum.org/v1/gonum

Motivation

Our task is to develop a bot that people can talk to. The business logic is oriented toward taking orders from the customers of the Los Gophers Hermanos restaurant (a pun on the Gopher, the Go mascot, and Los Pollos Hermanos from Breaking Bad 😅 Say my name).

You can access the repository of the web application where we use this repo by clicking here.

And our live demo here

Network

Our neural network classifies greetings, likes, dislikes and the orders of our clients. We therefore use a multiclass classification strategy: given an input sentence, the network proposes the class that the sentence belongs to.

In the following image we have the classes that our neural network can classify.

Architecture

The network has 3 layers: an input layer, one hidden layer and an output layer.

The following image shows our architecture.

Preprocessing

Since we are working with text, we normalize the sentences before handing them to the neural network. Our strategy is:

Make all the letters of the sentences lowercase
Replace accented vowels with their unaccented counterparts
Replace some special characters
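
The three steps above can be sketched in a few lines of Go. This is only an illustration under assumptions: the helper name `cleanSentence`, the set of accented characters, and the choice to turn every other symbol into a space are hypothetical and may differ from what the project actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// accentReplacer maps accented vowels to plain vowels.
// The exact character set handled by the project is an assumption.
var accentReplacer = strings.NewReplacer(
	"á", "a", "é", "e", "í", "i", "ó", "o", "ú", "u", "ü", "u",
)

// cleanSentence is a hypothetical helper that lowercases the sentence,
// removes accents and replaces special characters with spaces.
func cleanSentence(s string) string {
	s = strings.ToLower(s)
	s = accentReplacer.Replace(s)
	// Keep letters, digits, spaces and "ñ"; everything else becomes a space.
	s = strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == ' ', r == 'ñ':
			return r
		}
		return ' '
	}, s)
	// Collapse repeated whitespace into single spaces.
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	fmt.Println(cleanSentence("¡Hola! ¿Qué tal está la PIZZA?"))
}
```

Normalizing before tokenizing means that "Pizza", "PIZZA" and "pizza!" all end up as the same token.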

Training

Our neural network was trained with the following parameters.

Input layer neurons: 41
Hidden layer neurons: 7
Output layer neurons: 7
Number of iterations: 5000
Learning rate: 0.05

We tokenized each sentence of the training data and inserted the tokens into a "bag of words". The bag of words defines the vocabulary against which every training instance is modeled, so that the neural network can operate on it.

If you want to know more about "Bag of words" click here.

The following image illustrates how the sentences are treated. They are cleaned and then, according to the bag of words, each sentence is modeled as a row vector in which each element is the number of occurrences of the corresponding word.

Then we place every training instance in a single matrix.
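
The bag-of-words encoding described above could look roughly like the following sketch. It uses only the standard library; `buildVocabulary` and `vectorize` are hypothetical names, and sorting the vocabulary is an assumption made here only to keep the word indices reproducible.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildVocabulary collects every distinct word of the (already cleaned)
// training sentences and assigns each one a fixed index.
func buildVocabulary(sentences []string) map[string]int {
	seen := map[string]bool{}
	for _, s := range sentences {
		for _, w := range strings.Fields(s) {
			seen[w] = true
		}
	}
	words := make([]string, 0, len(seen))
	for w := range seen {
		words = append(words, w)
	}
	sort.Strings(words) // stable ordering, so indices do not change between runs
	vocab := make(map[string]int, len(words))
	for i, w := range words {
		vocab[w] = i
	}
	return vocab
}

// vectorize counts how often each vocabulary word occurs in the sentence.
// Words that are not in the vocabulary are ignored.
func vectorize(sentence string, vocab map[string]int) []float64 {
	v := make([]float64, len(vocab))
	for _, w := range strings.Fields(sentence) {
		if i, ok := vocab[w]; ok {
			v[i]++
		}
	}
	return v
}

func main() {
	train := []string{"hello there", "i want a pizza", "i like pizza"}
	vocab := buildVocabulary(train)
	fmt.Println(vectorize("i want pizza pizza", vocab))
}
```

Every sentence becomes a vector with one element per vocabulary word, so stacking these vectors gives the training matrix. In the project itself that matrix would presumably be a Gonum matrix rather than plain slices.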

Activation functions

For the hidden layer we use the ReLU activation function, and for the output layer we use the sigmoid activation function.
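
As a rough illustration, the two activation functions and a forward pass through one hidden layer and one output layer can be written as below. The weights are toy values, `dense` and `argmax` are hypothetical helpers, and picking the class with the highest output score is an assumption about how the final class is chosen. The project itself relies on Gonum matrices rather than plain slices.

```go
package main

import (
	"fmt"
	"math"
)

// relu returns max(0, x); used here for the hidden layer.
func relu(x float64) float64 { return math.Max(0, x) }

// sigmoid squashes x into the interval (0, 1); used here for the output layer.
func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// dense computes act(W·x + b), where w has one row per output neuron.
func dense(w [][]float64, b, x []float64, act func(float64) float64) []float64 {
	out := make([]float64, len(w))
	for i, row := range w {
		sum := b[i]
		for j, wij := range row {
			sum += wij * x[j]
		}
		out[i] = act(sum)
	}
	return out
}

// argmax returns the index of the largest score.
func argmax(v []float64) int {
	best := 0
	for i := range v {
		if v[i] > v[best] {
			best = i
		}
	}
	return best
}

func main() {
	// Toy network: 3 inputs -> 2 hidden neurons -> 2 outputs (made-up weights).
	w1 := [][]float64{{1, -1, 0}, {0, 1, 1}}
	b1 := []float64{0, 0}
	w2 := [][]float64{{1, 0}, {0, 1}}
	b2 := []float64{0, 0}

	x := []float64{1, 0, 2} // a bag-of-words vector
	hidden := dense(w1, b1, x, relu)
	scores := dense(w2, b2, hidden, sigmoid)
	fmt.Println(scores, "predicted class index:", argmax(scores))
}
```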

Results

Here we have the confusion matrix
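
For readers who want to reproduce such a matrix from their own predictions, here is a minimal sketch. The labels below are made up for illustration and are not the project's results; `confusionMatrix` and `accuracy` are hypothetical helpers.

```go
package main

import "fmt"

// confusionMatrix returns an n×n table where cell [a][p] counts the samples
// whose actual class is a and whose predicted class is p.
// It assumes both slices have the same length and all labels are in [0, n).
func confusionMatrix(actual, predicted []int, n int) [][]int {
	m := make([][]int, n)
	for i := range m {
		m[i] = make([]int, n)
	}
	for i := range actual {
		m[actual[i]][predicted[i]]++
	}
	return m
}

// accuracy is the share of samples that lie on the diagonal.
func accuracy(m [][]int) float64 {
	correct, total := 0, 0
	for i, row := range m {
		for j, c := range row {
			total += c
			if i == j {
				correct += c
			}
		}
	}
	if total == 0 {
		return 0
	}
	return float64(correct) / float64(total)
}

func main() {
	// Made-up labels for three classes.
	actual := []int{0, 0, 1, 2, 2, 2}
	predicted := []int{0, 1, 1, 2, 2, 0}

	m := confusionMatrix(actual, predicted, 3)
	for _, row := range m {
		fmt.Println(row)
	}
	fmt.Printf("accuracy: %.2f\n", accuracy(m))
}
```

Rows correspond to the actual class and columns to the predicted class, so off-diagonal cells show which classes the network confuses.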

Routing

Predictions are served through the following endpoint.

Method: POST
Endpoint: /v1/predictions

Note: you can send requests to the following URL.

https://gopher.fabricioism.com/v1/predictions

Send the following JSON body in the request:

{
    "sentence": "your sentence"
}

The service responds with the following JSON.

Status: 200

{
  "prediction": "(food,order,pizza)"
}
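
A small Go client for this endpoint might look like the sketch below. The struct and function names are hypothetical, and setting the `Content-Type` header is an assumption, since the text above does not say whether the server requires it. The sketch only builds the request and decodes the sample response; passing the request to `http.DefaultClient.Do` would actually send it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// predictionRequest mirrors the request payload shown above.
type predictionRequest struct {
	Sentence string `json:"sentence"`
}

// predictionResponse mirrors the response payload shown above.
type predictionResponse struct {
	Prediction string `json:"prediction"`
}

// newPredictionRequest builds the POST request for /v1/predictions.
func newPredictionRequest(baseURL, sentence string) (*http.Request, error) {
	body, err := json.Marshal(predictionRequest{Sentence: sentence})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/predictions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumption: the API expects a JSON content type.
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parsePrediction extracts the "prediction" field from a response body.
func parsePrediction(body []byte) (string, error) {
	var resp predictionResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	return resp.Prediction, nil
}

func main() {
	req, err := newPredictionRequest("https://gopher.fabricioism.com", "your sentence")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)

	// Decode the sample response from the text above.
	p, err := parsePrediction([]byte(`{"prediction": "(food,order,pizza)"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```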

Deployment

I have published a Docker image so that you can quickly run the service and connect to it from your web application. Here is the Docker command. 🐳

docker pull fabricioism/go-nn-text-classification:tagname

Follow this link

Further work

I think the performance could improve a lot with deeper work on the preprocessing, specifically by reducing words to their roots. For example, the words "having" and "have" could be reduced to their stem "hav". Stemming is a common technique in NLP.

If you want to know more about Stemming click here

In particular, I am interested in applying Porter's algorithm and building its equivalent for sentences in Spanish.

You can read more about Porter's Algorithm here
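
To make the stemming idea concrete, here is a deliberately naive suffix stripper. It is not Porter's algorithm and it only knows a handful of English suffixes chosen for this example; a Spanish version would need its own rules. The name `naiveStem` and the minimum stem length of three letters are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// suffixes are tried in order, longest first.
// Illustrative only: this is NOT Porter's algorithm.
var suffixes = []string{"ing", "ed", "es", "e", "s"}

// naiveStem strips the first matching suffix, as long as at least
// three letters of the word remain. It assumes lowercase ASCII input.
func naiveStem(word string) string {
	for _, suf := range suffixes {
		if strings.HasSuffix(word, suf) && len(word)-len(suf) >= 3 {
			return strings.TrimSuffix(word, suf)
		}
	}
	return word
}

func main() {
	for _, w := range []string{"having", "have", "ordered", "pizzas"} {
		fmt.Println(w, "->", naiveStem(w))
	}
}
```

With stemming in place, "having" and "have" map to the same bag-of-words entry, which shrinks the vocabulary and lets the network share what it learns across word forms.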

I hope this project helps you. If you have feedback for improvement, let me know; it's good to hear the opinion of others.

Take care and do not drink while driving.
