
# Notebook: Requirement Effort and Story Point Estimation

## 1. Introduction
In this notebook, we estimate Story Points from textual descriptions
using NLP-based models, including BERT-style transformer models and ensemble models.
Model performance is evaluated with MAE, R², RMSE, and the Pearson correlation.
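
As a reference for how these four metrics relate to each other, here is a minimal sketch in plain Python. It is not the code used by the notebook's helper modules (which are not shown and presumably rely on scikit-learn and SciPy); the function name `regression_metrics` is an assumption made for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R2 and Pearson correlation for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    # R2 compares the model against always predicting the mean of y_true;
    # it becomes negative when the model does worse than that baseline.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    # Pearson correlation is undefined when either sequence is constant.
    denom = math.sqrt(ss_tot * var_pred)
    pearson = cov / denom if denom > 0 else float("nan")

    return {"mae": mae, "rmse": rmse, "r2": r2, "pearson": pearson}


# A perfect prediction and a constant "predict the mean" baseline.
perfect = regression_metrics([1, 2, 3, 5], [1, 2, 3, 5])
baseline = regression_metrics([1, 2, 3, 5, 8], [3.8, 3.8, 3.8, 3.8, 3.8])
```

The constant baseline is a useful yardstick when reading the logs further down: an R² below zero means the model did worse than predicting the mean, and a constant prediction leaves the Pearson correlation undefined.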

## Main Steps
- Data preparation and preprocessing.
- Training of multiple models, including ensembles.
- Model evaluation and comparison of results.


## 2. Imports and Configuration

In [None]:
'''
%pip install langdetect
%pip install pandas
%pip install spacy
%pip install scikit-learn
%pip install bs4
%pip install nltk
%pip install xgboost
%pip install catboost
%pip install transformers torch
%pip install lightgbm
%pip install contractions
%pip install pyspellchecker
%pip install gensim
%pip install deslib
%pip install desReg
%pip install numpy
%pip install scipy
%pip install transformers[torch]
'''

In [1]:

# Library imports
import importlib
import preprocessing
import bert_evaluation
import ensemble_models

importlib.reload(preprocessing)
importlib.reload(bert_evaluation)
importlib.reload(ensemble_models)

# Global constants
VERSAO_NOME_BERT = "V_BERT"
VERSAO_NOME_ENSEMBLE = "V_ENSEMBLE"
DIRETORIO_DATASETS = 'D:\\Mestrado\\Python\\Projeto\\Datasets\\JIRA-Estimation-Prediction\\storypoint\\IEEE TSE2018\\dataset'
LIMITAR_QUANTIDADE_REGISTROS = False
QUANTIDADE_REGISTROS_SE_LIMITADO = 15
NOME_ARQUIVO_RESULTADOS = 'resultados_modelos.csv'
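
The dataset directory above is a hard-coded Windows path, so the notebook only runs unchanged on the author's machine. One possible alternative, sketched under assumptions, reads the location from an environment variable and falls back to a relative folder. The variable name `STORYPOINT_DATASET_DIR` and both helper functions are hypothetical and not part of the original project.

```python
import os
from pathlib import Path

# Hypothetical environment variable; the original notebook hard-codes the path.
ENV_VAR = "STORYPOINT_DATASET_DIR"


def resolve_dataset_dir(default="dataset"):
    """Return the dataset directory from the environment, or a relative default."""
    return Path(os.environ.get(ENV_VAR, default)).expanduser()


def list_csv_files(directory):
    """List CSV files in a directory sorted by name; empty if the directory is missing."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.suffix.lower() == ".csv")
```

With this in place, `DIRETORIO_DATASETS = str(resolve_dataset_dir())` would keep the rest of the notebook unchanged.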




## 3. Data Loading and Preprocessing

In [2]:

# Load the datasets into a list
datasetsCarregados = preprocessing.carregar_todos_dados(DIRETORIO_DATASETS, LIMITAR_QUANTIDADE_REGISTROS, QUANTIDADE_REGISTROS_SE_LIMITADO)

# Generate statistics for each dataset
for dataset in datasetsCarregados:
    preprocessing.gerar_estatisticas_base(dataset)

# Preprocess every dataset in the list, cleaning the text of each one
datasetsCarregados = preprocessing.preprocessar_todos_datasets(datasetsCarregados)
datasetsCarregados[0].head()


Dataset: jirasoftware.csv
Total Records: 234
Mean Words per Record: 114.11
Standard Deviation of Words per Record: 103.28
Story Point Distribution (Fibonacci): {1: 25, 2: 51, 3: 54, 5: 60, 8: 28, 13: 10}
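
The statistics printed above can be reproduced roughly as follows. This is a sketch, not the body of `preprocessing.gerar_estatisticas_base`, which is not shown. The set of accepted story point values is an assumption taken from the keys printed above; the printed counts sum to 228 of the 234 records, which suggests that values outside this set are left out of the distribution. Whether the notebook uses the sample or the population standard deviation is also not visible; the sketch uses the sample form.

```python
from collections import Counter
import statistics

# Assumed set of accepted values, taken from the keys in the printed distribution.
FIBONACCI_POINTS = {1, 2, 3, 5, 8, 13}


def dataset_statistics(descriptions, story_points):
    """Summarize record count, words per record and the story point distribution."""
    word_counts = [len(text.split()) for text in descriptions]
    fib_counts = Counter(sp for sp in story_points if sp in FIBONACCI_POINTS)
    return {
        "total_records": len(descriptions),
        "mean_words": statistics.mean(word_counts),
        # Sample standard deviation (n - 1 in the denominator).
        "std_words": statistics.stdev(word_counts) if len(word_counts) > 1 else 0.0,
        "fibonacci_distribution": dict(sorted(fib_counts.items())),
    }


stats = dataset_statistics(["a b c", "d e", "f"], [1, 20, 5])
```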




Processing Description: 100%|██████████| 234/234 [00:08<00:00, 28.65it/s]


| Unnamed: 0 | description | storypoint | issuekey | dataset_name | treated_description |
|---|---|---|---|---|---|
| 2 | Generic webwork aliases may clash with other p... | 5 | GHS-1681 | jirasoftware.csv | generic webwork alias clash plugin web work ac... |
| 4 | Add text to the Agile Gadget "Invalid Project"... | 2 | GHS-1819 | jirasoftware.csv | add text agile gadget invalid project message ... |
| 6 | Greenhopper ranking field is not displayed cor... | 20 | GHS-1882 | jirasoftware.csv | greenhopper rank field display correctly confl... |
| 8 | Version can be set in the create issue screen ... | 5 | GHS-2047 | jirasoftware.csv | version set create issue screen jira greenhopp... |
| 10 | As a user I would like the ability to see a ho... | 13 | GHS-2478 | jirasoftware.csv | user like ability horizontal swimlane rapid bo... |
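
Comparing `description` with `treated_description` suggests that the preprocessing strips markup, lowercases the text, removes stop words and lemmatizes tokens (for example, "aliases" becomes "alias"). The sketch below illustrates only the first three steps with the standard library. It is an approximation, not the project's `preprocessing` module: the stop-word list is a tiny hypothetical sample, and lemmatization is omitted because it would need a library such as spaCy or NLTK.

```python
import html
import re

# Tiny illustrative stop-word list; a real pipeline would load one from NLTK or spaCy.
STOP_WORDS = {
    "a", "an", "the", "is", "not", "may", "with", "other",
    "to", "in", "of", "be", "can", "as", "i", "would",
}

TAG_RE = re.compile(r"<[^>]+>")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def simple_clean(text):
    """Strip HTML tags, lowercase, drop non-letters and remove stop words."""
    text = html.unescape(TAG_RE.sub(" ", text))
    text = NON_LETTER_RE.sub(" ", text.lower())
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


cleaned = simple_clean('Add text to the Agile Gadget "Invalid Project" message')
```

Because no lemmatizer is applied, plural and inflected forms stay as they are, so the output only approximates the `treated_description` column.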


## 4. Training and Evaluation of the BERT Models

In [3]:
# Run the evaluation of BERT and similar models
datasets = datasetsCarregados

# Train and test on the datasets using several models
resultados_finais, predicoes_por_modelo = bert_evaluation.avaliar_modelo_bert_em_datasets(datasets, VERSAO_NOME_BERT)


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Analyzing Dataset 1 with model bert-base-uncased: jirasoftware.csv

KFold file already exists for dataset jirasoftware.csv. Skipping creation.
KFold indices loaded from file for dataset jirasoftware.csv.


100%|██████████| 11/11 [22:10<00:00, 106.26s/it]
100%|██████████| 11/11 [22:41<00:00, 106.26s/it]

{'eval_loss': 16.083585739135742, 'eval_mae': 2.898913621902466, 'eval_r2': -1.0691996983892422, 'eval_rmse': 4.010434627532959, 'eval_pearson_corr': -0.5862537311029284, 'eval_runtime': 25.9016, 'eval_samples_per_second': 0.734, 'eval_steps_per_second': 0.077, 'epoch': 1.0}


100%|██████████| 11/11 [22:50<00:00, 124.60s/it]


{'train_runtime': 1370.5931, 'train_samples_per_second': 0.123, 'train_steps_per_second': 0.008, 'train_loss': 26.276980313387785, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.16s/it]
100%|██████████| 3/3 [00:41<00:00, 13.89s/it]
100%|██████████| 11/11 [24:49<00:00, 117.10s/it]
100%|██████████| 11/11 [25:21<00:00, 117.10s/it]

{'eval_loss': 11.022942543029785, 'eval_mae': 2.143430709838867, 'eval_r2': -0.2697135327936373, 'eval_rmse': 3.3200814723968506, 'eval_pearson_corr': 0.10876864926222492, 'eval_runtime': 26.2253, 'eval_samples_per_second': 0.724, 'eval_steps_per_second': 0.076, 'epoch': 1.0}


100%|██████████| 11/11 [25:26<00:00, 138.76s/it]


{'train_runtime': 1526.3754, 'train_samples_per_second': 0.11, 'train_steps_per_second': 0.007, 'train_loss': 22.574346368963067, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.09s/it]
100%|██████████| 3/3 [00:42<00:00, 14.14s/it]
100%|██████████| 11/11 [26:57<00:00, 129.26s/it]
100%|██████████| 11/11 [27:27<00:00, 129.26s/it]

{'eval_loss': 8.397050857543945, 'eval_mae': 1.7449437379837036, 'eval_r2': -0.06288058191108958, 'eval_rmse': 2.897766590118408, 'eval_pearson_corr': -0.03189456621767561, 'eval_runtime': 25.7063, 'eval_samples_per_second': 0.739, 'eval_steps_per_second': 0.078, 'epoch': 1.0}


100%|██████████| 11/11 [27:38<00:00, 150.80s/it]


{'train_runtime': 1658.8358, 'train_samples_per_second': 0.101, 'train_steps_per_second': 0.007, 'train_loss': 16.573171442205254, 'epoch': 1.0}


100%|██████████| 2/2 [00:05<00:00,  2.71s/it]
100%|██████████| 3/3 [00:41<00:00, 13.97s/it]
100%|██████████| 11/11 [30:20<00:00, 143.24s/it]
100%|██████████| 11/11 [31:03<00:00, 143.24s/it]

{'eval_loss': 11.748919486999512, 'eval_mae': 2.6047720909118652, 'eval_r2': -0.152543434980291, 'eval_rmse': 3.4276695251464844, 'eval_pearson_corr': 0.0614096634512309, 'eval_runtime': 25.9688, 'eval_samples_per_second': 0.732, 'eval_steps_per_second': 0.077, 'epoch': 1.0}


100%|██████████| 11/11 [31:09<00:00, 169.94s/it]


{'train_runtime': 1869.3044, 'train_samples_per_second': 0.09, 'train_steps_per_second': 0.006, 'train_loss': 13.19871798428622, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.02s/it]
100%|██████████| 3/3 [00:51<00:00, 17.26s/it]
                                                
100%|██████████| 11/11 [25:41<00:00, 120.12s/it]

{'eval_loss': 10.522127151489258, 'eval_mae': 2.5153732299804688, 'eval_r2': -0.2806769036811836, 'eval_rmse': 3.2437827587127686, 'eval_pearson_corr': -0.006051835110559643, 'eval_runtime': 26.7434, 'eval_samples_per_second': 0.71, 'eval_steps_per_second': 0.075, 'epoch': 1.0}


100%|██████████| 11/11 [26:00<00:00, 141.88s/it]


{'train_runtime': 1560.677, 'train_samples_per_second': 0.108, 'train_steps_per_second': 0.007, 'train_loss': 12.486792824485086, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.14s/it]
100%|██████████| 3/3 [00:40<00:00, 13.62s/it]
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at microsoft/codebert-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Analyzing Dataset 1 with model microsoft/codebert-base: jirasoftware.csv

KFold file already exists for dataset jirasoftware.csv. Skipping creation.
KFold indices loaded from file for dataset jirasoftware.csv.


100%|██████████| 11/11 [25:58<00:00, 122.16s/it]
100%|██████████| 11/11 [26:34<00:00, 122.16s/it]

{'eval_loss': 17.324739456176758, 'eval_mae': 3.103487014770508, 'eval_r2': -1.228877649805169, 'eval_rmse': 4.162299633026123, 'eval_pearson_corr': 0.07350519663806834, 'eval_runtime': 26.9195, 'eval_samples_per_second': 0.706, 'eval_steps_per_second': 0.074, 'epoch': 1.0}


100%|██████████| 11/11 [26:43<00:00, 145.77s/it]


{'train_runtime': 1603.5086, 'train_samples_per_second': 0.105, 'train_steps_per_second': 0.007, 'train_loss': 28.75669999556108, 'epoch': 1.0}


100%|██████████| 2/2 [00:03<00:00,  1.96s/it]
100%|██████████| 3/3 [00:42<00:00, 14.22s/it]
100%|██████████| 11/11 [26:45<00:00, 127.76s/it]
100%|██████████| 11/11 [27:37<00:00, 127.76s/it]

{'eval_loss': 9.218162536621094, 'eval_mae': 1.9649819135665894, 'eval_r2': -0.061824214504402564, 'eval_rmse': 3.0361428260803223, 'eval_pearson_corr': 0.42610675422328986, 'eval_runtime': 27.6926, 'eval_samples_per_second': 0.686, 'eval_steps_per_second': 0.072, 'epoch': 1.0}


100%|██████████| 11/11 [27:58<00:00, 152.63s/it]


{'train_runtime': 1678.9706, 'train_samples_per_second': 0.1, 'train_steps_per_second': 0.007, 'train_loss': 21.810896439985797, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.05s/it]
100%|██████████| 3/3 [00:41<00:00, 13.81s/it]
                                                
100%|██████████| 11/11 [28:54<00:00, 138.00s/it]

{'eval_loss': 8.057891845703125, 'eval_mae': 2.11419415473938, 'eval_r2': -0.019950655530418304, 'eval_rmse': 2.8386428356170654, 'eval_pearson_corr': -0.02997497164836991, 'eval_runtime': 29.0215, 'eval_samples_per_second': 0.655, 'eval_steps_per_second': 0.069, 'epoch': 1.0}


100%|██████████| 11/11 [29:07<00:00, 158.88s/it]


{'train_runtime': 1747.7161, 'train_samples_per_second': 0.096, 'train_steps_per_second': 0.006, 'train_loss': 14.297221790660512, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.11s/it]
100%|██████████| 3/3 [00:43<00:00, 14.36s/it]
                                                
100%|██████████| 11/11 [28:51<00:00, 137.20s/it]

{'eval_loss': 10.264676094055176, 'eval_mae': 2.5940663814544678, 'eval_r2': -0.0069423918439444066, 'eval_rmse': 3.2038533687591553, 'eval_pearson_corr': 0.25035344321058584, 'eval_runtime': 27.8144, 'eval_samples_per_second': 0.683, 'eval_steps_per_second': 0.072, 'epoch': 1.0}


100%|██████████| 11/11 [29:06<00:00, 158.79s/it]


{'train_runtime': 1746.658, 'train_samples_per_second': 0.096, 'train_steps_per_second': 0.006, 'train_loss': 12.362056385387074, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.10s/it]
100%|██████████| 3/3 [00:42<00:00, 14.21s/it]
                                                
100%|██████████| 11/11 [28:58<00:00, 136.71s/it]

{'eval_loss': 8.636096954345703, 'eval_mae': 2.1940925121307373, 'eval_r2': -0.05112310024211886, 'eval_rmse': 2.9387238025665283, 'eval_pearson_corr': 0.25626652492793894, 'eval_runtime': 27.3035, 'eval_samples_per_second': 0.696, 'eval_steps_per_second': 0.073, 'epoch': 1.0}


100%|██████████| 11/11 [29:10<00:00, 159.12s/it]


{'train_runtime': 1750.316, 'train_samples_per_second': 0.097, 'train_steps_per_second': 0.006, 'train_loss': 11.815364490855824, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.12s/it]
100%|██████████| 3/3 [00:41<00:00, 13.76s/it]
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at FacebookAI/roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Analyzing Dataset 1 with model FacebookAI/roberta-base: jirasoftware.csv

KFold file already exists for dataset jirasoftware.csv. Skipping creation.
KFold indices loaded from file for dataset jirasoftware.csv.


100%|██████████| 11/11 [27:40<00:00, 133.27s/it]
100%|██████████| 11/11 [28:20<00:00, 133.27s/it]

{'eval_loss': 17.12293815612793, 'eval_mae': 3.0655715465545654, 'eval_r2': -1.202915548042998, 'eval_rmse': 4.1379876136779785, 'eval_pearson_corr': 0.07849704357265257, 'eval_runtime': 28.5678, 'eval_samples_per_second': 0.665, 'eval_steps_per_second': 0.07, 'epoch': 1.0}


100%|██████████| 11/11 [28:29<00:00, 155.38s/it]


{'train_runtime': 1709.1957, 'train_samples_per_second': 0.098, 'train_steps_per_second': 0.006, 'train_loss': 28.469854181463067, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.10s/it]
100%|██████████| 3/3 [00:43<00:00, 14.43s/it]
                                                
100%|██████████| 11/11 [28:43<00:00, 133.70s/it]

{'eval_loss': 8.685417175292969, 'eval_mae': 2.0688483715057373, 'eval_r2': -0.00045818904649785885, 'eval_rmse': 2.947103261947632, 'eval_pearson_corr': 0.49417035969525114, 'eval_runtime': 27.1431, 'eval_samples_per_second': 0.7, 'eval_steps_per_second': 0.074, 'epoch': 1.0}


100%|██████████| 11/11 [29:07<00:00, 158.89s/it]


{'train_runtime': 1747.7382, 'train_samples_per_second': 0.096, 'train_steps_per_second': 0.006, 'train_loss': 21.01156893643466, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.01s/it]
100%|██████████| 3/3 [00:42<00:00, 14.22s/it]
100%|██████████| 11/11 [30:52<00:00, 143.61s/it]
100%|██████████| 11/11 [31:31<00:00, 143.61s/it]

{'eval_loss': 8.27646541595459, 'eval_mae': 2.221412181854248, 'eval_r2': -0.04761716739006183, 'eval_rmse': 2.876884698867798, 'eval_pearson_corr': -0.00268998333602756, 'eval_runtime': 27.7404, 'eval_samples_per_second': 0.685, 'eval_steps_per_second': 0.072, 'epoch': 1.0}


100%|██████████| 11/11 [31:41<00:00, 172.90s/it]


{'train_runtime': 1901.8608, 'train_samples_per_second': 0.088, 'train_steps_per_second': 0.006, 'train_loss': 14.126184636896307, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.02s/it]
100%|██████████| 3/3 [00:43<00:00, 14.55s/it]
100%|██████████| 11/11 [28:53<00:00, 136.95s/it]
100%|██████████| 11/11 [30:01<00:00, 136.95s/it]

{'eval_loss': 10.191326141357422, 'eval_mae': 2.6641640663146973, 'eval_r2': 0.0002530493841979009, 'eval_rmse': 3.192385673522949, 'eval_pearson_corr': 0.045716820528224233, 'eval_runtime': 27.7198, 'eval_samples_per_second': 0.685, 'eval_steps_per_second': 0.072, 'epoch': 1.0}


100%|██████████| 11/11 [30:09<00:00, 164.53s/it]


{'train_runtime': 1809.839, 'train_samples_per_second': 0.093, 'train_steps_per_second': 0.006, 'train_loss': 12.42038934881037, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.12s/it]
100%|██████████| 3/3 [00:42<00:00, 14.19s/it]
                                                
100%|██████████| 11/11 [31:45<00:00, 156.58s/it]

{'eval_loss': 10.050394058227539, 'eval_mae': 2.4653027057647705, 'eval_r2': -0.22326097905174302, 'eval_rmse': 3.1702353954315186, 'eval_pearson_corr': -0.22886306867583944, 'eval_runtime': 29.0863, 'eval_samples_per_second': 0.653, 'eval_steps_per_second': 0.069, 'epoch': 1.0}


100%|██████████| 11/11 [31:55<00:00, 174.15s/it]


{'train_runtime': 1915.6759, 'train_samples_per_second': 0.088, 'train_steps_per_second': 0.006, 'train_loss': 11.798769864169033, 'epoch': 1.0}


100%|██████████| 2/2 [00:04<00:00,  2.06s/it]
100%|██████████| 3/3 [00:41<00:00, 13.72s/it]
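
The log lines "KFold file already exists ... Skipping creation" and "KFold indices loaded from file" indicate that fold indices are written to disk once and reused, so every model is evaluated on the same splits. The following sketch shows one way to get that behavior. The function name, the JSON format, the number of folds and the seed are all assumptions; the project's actual implementation is not shown.

```python
import json
import random
from pathlib import Path


def load_or_create_kfold(n_samples, path, k=5, seed=42):
    """Load fold indices from `path`, or create, save and return them if missing."""
    path = Path(path)
    if path.exists():
        # Existing folds win: k and seed are ignored so all models share the splits.
        return json.loads(path.read_text(encoding="utf-8"))

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = []
    for i in range(k):
        # Every k-th shuffled index goes to the test side of fold i.
        test_idx = sorted(indices[i::k])
        test_set = set(test_idx)
        train_idx = [j for j in range(n_samples) if j not in test_set]
        folds.append({"train": train_idx, "test": test_idx})

    path.write_text(json.dumps(folds), encoding="utf-8")
    return folds
```

Reusing a saved split makes the per-model metrics comparable, at the cost of having to delete the file by hand when the dataset changes.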


In [None]:
## Required when the lines below are uncommented
#import pandas as pd

## List to store the data
#predicoes_lista = []

#for model_name, data in predicoes_por_modelo.items():
#    for issuekey, description, y_test, y_pred in zip(data['issuekeys'], data['descriptions'], data['y_test'], data['y_pred']):
#        predicoes_lista.append({
#            'Modelo': model_name,
#            'IssueKey': issuekey,
#            'Descrição': description,
#            'Valor Real': y_test,
#            'Valor Predito': y_pred
#        })

## Convert the list into a pandas DataFrame
#df_predicoes = pd.DataFrame(predicoes_lista)

## Export to CSV
#df_predicoes.to_csv('predicoes_por_modelo.csv', index=False)

#print("Predictions exported to 'predicoes_por_modelo.csv'")

In [5]:
# Export the results to a CSV file
preprocessing.exportar_resultados_para_csv(resultados_finais, NOME_ARQUIVO_RESULTADOS)
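
How `resultados_finais` aggregates the individual evaluation runs is not visible in this notebook. As an illustration, the sketch below averages `eval_mae` and `eval_r2` over the five evaluation dictionaries logged above for `bert-base-uncased` (values copied from the output). The `summarize` helper is hypothetical.

```python
import statistics

# eval_mae and eval_r2 copied from the five bert-base-uncased evaluation logs above.
fold_metrics = [
    {"eval_mae": 2.898913621902466, "eval_r2": -1.0691996983892422},
    {"eval_mae": 2.143430709838867, "eval_r2": -0.2697135327936373},
    {"eval_mae": 1.7449437379837036, "eval_r2": -0.06288058191108958},
    {"eval_mae": 2.6047720909118652, "eval_r2": -0.152543434980291},
    {"eval_mae": 2.5153732299804688, "eval_r2": -0.2806769036811836},
]


def summarize(metrics, keys):
    """Return the mean and sample standard deviation of each key across runs."""
    summary = {}
    for key in keys:
        values = [m[key] for m in metrics]
        summary[key] = {
            "mean": statistics.mean(values),
            "std": statistics.stdev(values),
        }
    return summary


summary = summarize(fold_metrics, ["eval_mae", "eval_r2"])
```

For these five runs the mean MAE is about 2.38 and the mean R² is about -0.37, that is, below the predict-the-mean baseline after a single training epoch.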

## 5. Training and Evaluation of the Ensemble Models

In [6]:

# Run the evaluation of the ensemble models
datasets = datasetsCarregados

# Train and test on the datasets using several models
resultados_finais, predicoes_por_modelo = ensemble_models.avaliar_modelosCombinados_em_datasets(datasets, VERSAO_NOME_ENSEMBLE)




Analyzing Dataset 1: jirasoftware.csv

KFold file already exists for dataset jirasoftware.csv. Skipping creation.
KFold indices loaded from file for dataset jirasoftware.csv.

Model: Linear Regression

Model: Ridge Regression

Model: Lasso Regression





Model: ElasticNet Regression

Model: SVR

Model: KNN

Model: Decision Tree

Model: Random Forest

Model: XGBoost

Model: LightGBM
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000216 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 568
[LightGBM] [Info] Number of data points in the train set: 149, number of used features: 45
[LightGBM] [Info] Start training from score 4.181208
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000193 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 603
[LightGBM] [Info] Number of data points in the train set: 149, number of used features: 48
[LightGBM] [Info] Start training from score 3.778523
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the o
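
How `ensemble_models` combines the regressors listed above is not shown in this notebook. Purely as an illustration of the simplest combination strategy, the sketch below averages the predictions of several models, optionally with weights. The function name and the input layout (one list of predictions per model name) are assumptions, and this is not claimed to be the strategy used by the project.

```python
def average_ensemble(predictions_by_model, weights=None):
    """Combine per-model prediction lists into one list by (weighted) averaging."""
    names = list(predictions_by_model)
    if not names:
        raise ValueError("at least one model is required")
    n = len(predictions_by_model[names[0]])
    if any(len(predictions_by_model[name]) != n for name in names):
        raise ValueError("all models must predict the same number of samples")

    if weights is None:
        # Equal weights give the plain arithmetic mean.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)

    return [
        sum(weights[name] * predictions_by_model[name][i] for name in names) / total
        for i in range(n)
    ]


combined = average_ensemble({"ridge": [1.0, 3.0], "knn": [3.0, 5.0]})
```

Weights could, for example, be derived from each model's validation error, so that better models contribute more to the combined estimate.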

In [None]:
## Required when the lines below are uncommented
#import pandas as pd

## List to store the data
#predicoes_lista = []

#for model_name, data in predicoes_por_modelo.items():
#    for issuekey, description, y_test, y_pred in zip(data['issuekeys'], data['descriptions'], data['y_test'], data['y_pred']):
#        predicoes_lista.append({
#            'Modelo': model_name,
#            'IssueKey': issuekey,
#            'Descrição': description,
#            'Valor Real': y_test,
#            'Valor Predito': y_pred
#        })

## Convert the list into a pandas DataFrame
#df_predicoes = pd.DataFrame(predicoes_lista)

## Export to CSV
#df_predicoes.to_csv('predicoes_por_modelo.csv', index=False)

#print("Predictions exported to 'predicoes_por_modelo.csv'")

In [4]:
# Export the results to a CSV file
preprocessing.exportar_resultados_para_csv(resultados_finais, NOME_ARQUIVO_RESULTADOS)
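
Both export cells write to the same `NOME_ARQUIVO_RESULTADOS`. Whether `exportar_resultados_para_csv` appends to the file or overwrites it is not visible here; if it overwrites, the BERT results exported earlier would be replaced by the ensemble results. The sketch below shows an append-style export with the standard library. The function name and the column names used in the usage example are hypothetical.

```python
import csv
from pathlib import Path


def append_results_to_csv(rows, path):
    """Append result dicts to a CSV file, writing the header only for a new file."""
    rows = list(rows)
    if not rows:
        return
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    # Columns are taken from the first row; all rows are assumed to share them.
    fieldnames = list(rows[0].keys())
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

A call such as `append_results_to_csv([{"model": "ridge", "mae": 1.0}], "results.csv")` adds one row per result without touching rows written by earlier cells.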

## 6. Conclusions


In this section, we discuss the results obtained with each model. We observe that:
- [Include observations on model performance, strengths, and limitations.]
- [Suggest possible next steps, such as further optimization or experiments with new data.]
