This repository is inspired by some Hugging Face tutorials and contains notebooks to explore the text dataset and to train the model. There is also a notebook on how to upload the model to the Hugging Face Hub.
The main notebook describes the most relevant steps to train a Hugging Face model in AWS SageMaker, showing how to track experiments and how to deal with some of the problems that arise with custom models when using SageMaker script mode. Some basic SageMaker concepts are not covered in detail, in order to keep the focus on the relevant ones.
The following steps will be explained:
- Create an Experiment and Trial to keep track of our experiments
- Load the training data to our training instance, create the train, validation, and test datasets, and upload them to S3
- Create the scripts to train our Hugging Face model, a RoBERTa-based model pretrained on a Spanish corpus: RuPERTa
- Create an Estimator to train our model in a Hugging Face container in script mode
- Download and deploy the trained model to make predictions
- Create a Batch Transform job to make predictions for the test dataset
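The train/validation/test split from the second step can be sketched in pure Python before uploading the files to S3 (the fractions and seed below are illustrative assumptions, not the values used in the notebooks):

```python
import random

def train_val_test_split(examples, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle a list of examples and split it into train/validation/test."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(list(range(100)))
print(len(train), len(val), len(test))  # → 80 10 10
```

Each split would then be written to its own file and uploaded to the S3 bucket that the training job reads from.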
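The training, deployment, and Batch Transform steps can be sketched with the SageMaker Python SDK. This is a minimal sketch, not the exact code in the notebooks: the entry point `train.py`, the `scripts/` folder, the S3 paths, the instance types, the framework versions, and the hyperparameter names are all illustrative assumptions, and running it requires an AWS account with a suitable IAM role:

```python
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # IAM role of the notebook instance

# Hyperparameters are passed to train.py as command-line arguments (script mode).
hyperparameters = {
    "epochs": 3,
    "train_batch_size": 32,
    "model_name": "mrm8488/RuPERTa-base",  # assumed Hub id for RuPERTa
}

huggingface_estimator = HuggingFace(
    entry_point="train.py",        # our training script
    source_dir="./scripts",        # folder with the script and requirements.txt
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.6.1",  # must be a supported version combination
    pytorch_version="1.7.1",
    py_version="py36",
    hyperparameters=hyperparameters,
)

# Each channel becomes SM_CHANNEL_TRAIN / SM_CHANNEL_VALIDATION inside the container.
huggingface_estimator.fit({
    "train": "s3://my-bucket/data/train",
    "validation": "s3://my-bucket/data/validation",
})

# Deploy the trained model to a real-time endpoint...
predictor = huggingface_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)

# ...or run a Batch Transform job over the test dataset in S3.
transformer = huggingface_estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    strategy="SingleRecord",
)
transformer.transform("s3://my-bucket/data/test", content_type="application/json")
```

An `experiment_config` dict (with the Experiment and Trial names from the first step) can also be passed to `fit()` so the training job is recorded under that Trial.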
On development: different models are under development.
If you find a bug or a typo, please let me know, or fix it and open a pull request so it can be reviewed.
These notebooks are released under a public GNU license.