# Developers' Software Metrics Analysis

Projects used in this analysis:
 * [__Elasticsearch__](https://github.com/elastic/elasticsearch)
 * [__okhttp__](https://github.com/square/okhttp)
 * [__signal-android__](https://github.com/signalapp/Signal-Android)
 * [__bazel__]()
 * [__guava__]()
 * [__netty__]()
 * [__presto__]()
 * [__rxjava__]()
 * [__spring-boot__]()
 
 <font color='red'>__Note__: Not all commits are available for every project. Some commits have no __SM__ data because they are configuration commits.</font>
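
The gap described in the note can be measured before running the analysis. The following is a minimal sketch, assuming the per-commit SM files are named `<hash>.csv` and stored in one directory, which is the layout the analysis cell reads from. `sm_coverage` is a hypothetical helper and not part of the original notebook.

```python
import os

def sm_coverage(hashes, sm_dir):
    """Split commit hashes into those with and without an SM file in sm_dir.

    Assumption: each commit's metrics live in '<sm_dir>/<hash>.csv'.
    """
    found, missing = [], []
    for h in hashes:
        path = os.path.join(sm_dir, h + '.csv')
        # Append to 'found' when the file exists, otherwise to 'missing'.
        (found if os.path.isfile(path) else missing).append(h)
    return found, missing
```

Calling `sm_coverage(df['hash'], sm_dir)` on a project's metrics table would show how many commits are skipped for lack of SM data.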

## Getting started

Import the required libraries:

In [1]:
import os
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import json
from tqdm import tqdm

In [4]:
projects_list = os.listdir('/home/macdowell/pesquisa/developer_analysis/developers_data/bug_reports_validated/')
for project in projects_list:
    
    datapath = '/home/macdowell/pesquisa/developer_analysis/developers_data/bug_reports_validated/' + project + '/metrics_' + project + '.csv'
    
    # Reading developer metrics:
    df = pd.read_csv(datapath)

    # Dropping all columns except HASH, BUGGY and NATURE:
    df.drop(columns=df.columns.difference(['hash','buggy','nature']), inplace=True)

    # Getting list of nature:
    natures = df.nature.unique()
    df_by_nature = []

    # Filtering by commit nature:
    for nature in natures:
        print(nature + " Analysis: ")
        method_result = None
        class_result = None
        first = True

        sm_by_nature = df[df['nature'].str.contains(nature, regex=False)]

        hash_len = len(sm_by_nature['hash'])

        # Iterating over the commits of this nature:
        for h in tqdm(sm_by_nature['hash']):
            # Per-commit SM file; assumes SM_data has one directory per project:
            sm_path = '/home/macdowell/pesquisa/SM_data/' + project + '/' + h + '.csv'
            # Checking if the commit's SM file exists:
            if os.path.isfile(sm_path):

                sm_data = pd.read_csv(sm_path)

                # Filtering by methods and classes SM:
                method_sm_data = sm_data[sm_data['Kind'].str.contains('ethod')]
                class_sm_data = sm_data[sm_data['Kind'].str.contains('lass')]

                # Dropping string columns:
                method_sm_data = method_sm_data.drop(['Kind', 'Name', 'File'], axis=1)
                class_sm_data = class_sm_data.drop(['Kind', 'Name', 'File'], axis=1)

                class_avg = class_sm_data.sum()/len(class_sm_data)
                method_avg = method_sm_data.sum()/len(method_sm_data)

                if first:
                    class_result = class_avg
                    method_result = method_avg
                    first = False
                else:
                    class_result += class_avg
                    method_result += method_avg

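        # Skipping the average when no SM file was found for this nature.
        # hash_len counts every commit of the nature, including those without an SM file.
        if class_result is not None:
            class_result = class_result / hash_len
            method_result = method_result / hash_len
            print(class_result)
        # TODO: Output metric results with to_csv.
        # Only the first nature is processed for now:
        break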



Corrective Engineering Analysis: 

 13%|█▎        | 40/310 [00:03<00:26, 10.10it/s]

KeyboardInterrupt: 
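
The TODO in the analysis cell (writing the averaged metrics with `to_csv`) could be handled roughly as follows. This is a sketch under assumptions, not the notebook's method: `average_metrics` is a hypothetical helper, the `CountLine` column and the `Kind` values in the demo data are made up for illustration, and the only layout taken from the cell is the presence of `Kind`, `Name` and `File` columns. Unlike the cell, it divides by the number of commits that actually contain rows of the requested kind, not by the total number of hashes.

```python
import pandas as pd

def average_metrics(frames, kind_pattern):
    """Average numeric metrics within each commit, then across commits.

    frames: iterable of per-commit DataFrames with 'Kind', 'Name' and 'File'
            columns plus numeric metric columns (assumed layout).
    kind_pattern: literal substring matched against 'Kind', e.g. 'ethod' or 'lass'.
    """
    per_commit = []
    for sm_data in frames:
        mask = sm_data['Kind'].str.contains(kind_pattern, regex=False, na=False)
        rows = sm_data[mask]
        if rows.empty:
            continue  # this commit has no entity of the requested kind
        numeric = rows.drop(columns=['Kind', 'Name', 'File'])
        per_commit.append(numeric.mean(numeric_only=True))
    if not per_commit:
        return pd.Series(dtype=float)
    # Rows are commits, columns are metrics; the column mean is the final average.
    return pd.DataFrame(per_commit).mean()

# Hypothetical demo data standing in for one commit's SM file.
demo = pd.DataFrame({
    'Kind': ['Public Method', 'Class', 'Private Method'],
    'Name': ['a', 'B', 'c'],
    'File': ['x.java', 'x.java', 'x.java'],
    'CountLine': [10, 100, 20],
})
result = average_metrics([demo, demo], 'ethod')

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = result.to_frame('average').to_csv()
```

Averaging per commit first keeps large commits from dominating the result, which matches the two-step averaging the cell performs.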