# Rabbit with ghmap
This test notebook extracts the activities of a GitHub user with the `ghmap` package, then computes the features used by the `rabbit` package.

In [1]:
from src.bimbas_ghmap import query_events, events_to_activities, activity_to_df, extract_features

import important_features as rabbit_features

# 1 - Extracting activities

## 1.1 - Set up the variables
We need a GitHub API key and the username of the account whose activities we want to extract.

In [2]:
API_KEY = None  # Set your GitHub API key here
USER = 'robodoo'
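
To avoid hard-coding the key in the notebook, it can be read from the environment instead. The following is a minimal sketch; the helper `load_api_key` and the variable name `GITHUB_TOKEN` are assumptions made for illustration and are not part of `ghmap` or `rabbit`.

```python
import os


def load_api_key(env=None, name="GITHUB_TOKEN"):
    """Return the token stored under `name`, or None if it is unset or empty.

    `GITHUB_TOKEN` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    return env.get(name) or None


# Falls back to None when the variable is not defined.
API_KEY = load_api_key()
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.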

## 1.2 - Get the raw events from the user
By default, `ghmap` needs raw GitHub events to extract the activities, so we first query them for the user.

In [3]:
raw_events = query_events(USER, API_KEY)
raw_events[-1]

{'id': '45325080089',
 'type': 'PushEvent',
 'actor': {'id': 16837285,
  'login': 'robodoo',
  'display_login': 'robodoo',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/robodoo',
  'avatar_url': 'https://avatars.githubusercontent.com/u/16837285?'},
 'repo': {'id': 362812569,
  'name': 'odoo/design-themes',
  'url': 'https://api.github.com/repos/odoo/design-themes'},
 'payload': {'repository_id': 362812569,
  'push_id': 22005571039,
  'size': 1,
  'distinct_size': 1,
  'ref': 'refs/heads/staging.saas-17.4',
  'head': 'fd82a5d0ae99d3dbc20b5ec3102fa7a65538c92d',
  'before': '7e723f257eeff9f97f3fa142d9961a53d735698d',
  'commits': [{'sha': 'fd82a5d0ae99d3dbc20b5ec3102fa7a65538c92d',
    'author': {'email': 'robodoo@odoo.com', 'name': "Odoo's Mergebot"},
    'message': 'force rebuild\n\nuniquifier: nRnFriC1VpIMoiqE\nFor-Commit-Id: b36b9076cc6e63442a3c8952ce0c6d8f7810602e',
    'distinct': True,
    'url': 'https://api.github.com/repos/odoo/design-themes/commits/fd82a5d0ae99d3d
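
For readers who want to see where such events can come from, here is a rough sketch of a request against the GitHub REST endpoint that lists a user's events. It is an assumption that `query_events` does something comparable; the helpers `build_events_request` and `fetch_events_page` are hypothetical and are not part of `ghmap`.

```python
import json
import urllib.parse
import urllib.request

GITHUB_API = "https://api.github.com"


def build_events_request(user, api_key=None, page=1, per_page=100):
    """Build (but do not send) a request for one page of a user's events."""
    query = urllib.parse.urlencode({"per_page": per_page, "page": page})
    url = f"{GITHUB_API}/users/{urllib.parse.quote(user)}/events?{query}"
    headers = {"Accept": "application/vnd.github+json"}
    if api_key:
        # Authenticated requests get a higher rate limit.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers)


def fetch_events_page(user, api_key=None, page=1):
    """Send the request (needs network access) and decode the JSON list."""
    request = build_events_request(user, api_key, page)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating the request construction from the network call keeps the first part testable offline; `fetch_events_page("robodoo", API_KEY)` would perform the actual query.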

## 1.3 - Extract the activities
We can now use `ghmap` to extract the activities from the raw events.

In [4]:
activities = events_to_activities(raw_events)

Mapping events to actions: 100%|██████████| 232/232 [00:00<00:00, 15689.50event/s]
Mapping actions to activities: 100%|██████████| 7/7 [00:00<00:00, 430.51group/s]


In [5]:
print(activities[-1])

{'activity': 'PushCommits', 'start_date': '2025-01-07T14:29:34Z', 'end_date': '2025-01-07T14:29:34Z', 'actor': {'id': 16837285, 'login': 'robodoo'}, 'repository': {'id': 19745004, 'name': 'odoo/odoo', 'organisation': 'odoo', 'organisation_id': 6368483}, 'actions': [{'action': 'PushCommits', 'event_id': '45331739632', 'date': '2025-01-07T14:29:34Z', 'details': {'push': {'id': 22008910634, 'ref': 'refs/heads/staging.17.0', 'commits': 3}}}]}
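
Since each activity is a plain dictionary, a quick overview can be obtained with the standard library. The sketch below assumes every activity has the `activity` and `repository.name` keys seen in the printed example; `count_activities` is a hypothetical helper, and the `sample` list is made-up data in the same shape.

```python
from collections import Counter


def count_activities(activities):
    """Count activities per activity type and per repository name."""
    by_type = Counter(a["activity"] for a in activities)
    by_repo = Counter(a["repository"]["name"] for a in activities)
    return by_type, by_repo


# Made-up sample mirroring the structure printed above.
sample = [
    {"activity": "PushCommits", "repository": {"name": "odoo/odoo"}},
    {"activity": "CommentPullRequest", "repository": {"name": "odoo/odoo"}},
    {"activity": "PushCommits", "repository": {"name": "odoo/documentation"}},
]
by_type, by_repo = count_activities(sample)
```

In the notebook, `count_activities(activities)` would give the same two counters for the real data.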


# 2 - Extracting features
Now we can extract the features used by the BIMBAS model.
The features are divided into two groups:
- **a**: Counting metrics
- **b**: Aggregated metrics (mean, std, median, IQR, gini)
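
To make the aggregated metrics of group **b** concrete, the sketch below computes the five statistics for a list of numbers using only the standard library. It is an illustration under assumptions, not the RABBIT implementation: the population standard deviation, the "inclusive" quantile method, and the helper names `gini` and `aggregate` are all choices made for this example.

```python
import statistics


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def aggregate(values):
    """Return mean, median, std, IQR and gini; expects at least two values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),  # population std, an assumption
        "IQR": q3 - q1,
        "gini": gini(values),
    }
```

A Gini value close to 1 indicates that a few values dominate the total, while a value close to 0 indicates an even spread.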



## 2.1 - Convert to a DataFrame compatible with RABBIT
RABBIT needs a DataFrame with the columns 'date', 'activity', 'contributor' and 'repository'.


In [6]:
df = activity_to_df(activities)
display(df)

| | date | activity | contributor | repository |
|---|---|---|---|---|
| 0 | 2025-01-07 11:09:02 | PushCommits | robodoo | odoo/design-themes |
| 1 | 2025-01-07 11:09:04 | PushCommits | robodoo | odoo/documentation |
| 2 | 2025-01-07 11:09:15 | CommentPullRequest | robodoo | odoo/odoo |
| 3 | 2025-01-07 11:09:32 | CommentPullRequest | robodoo | odoo/odoo |
| 4 | 2025-01-07 11:09:46 | CommentPullRequest | robodoo | odoo/odoo |
| ... | ... | ... | ... | ... |
| 205 | 2025-01-07 14:28:59 | PushCommits | robodoo | odoo/odoo |
| 206 | 2025-01-07 14:29:07 | CommentPullRequest | robodoo | odoo/odoo |
| 207 | 2025-01-07 14:29:32 | PushCommits | robodoo | odoo/design-themes |
| 208 | 2025-01-07 14:29:32 | PushCommits | robodoo | odoo/documentation |
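
The conversion to the four required columns can be pictured as flattening each activity dictionary into one row. The sketch below is a guess at the idea rather than the actual `activity_to_df` code: the helper `activities_to_rows` is hypothetical, and using `start_date` for the `date` column is an assumption.

```python
from datetime import datetime


def activities_to_rows(activities):
    """Flatten activity dicts into rows with the four columns RABBIT expects."""
    rows = []
    for a in activities:
        rows.append({
            # Assumption: the activity's start date becomes the row date.
            "date": datetime.strptime(a["start_date"], "%Y-%m-%dT%H:%M:%SZ"),
            "activity": a["activity"],
            "contributor": a["actor"]["login"],
            "repository": a["repository"]["name"],
        })
    # Keep the rows in chronological order.
    return sorted(rows, key=lambda row: row["date"])


# Made-up sample in the shape of the activities shown earlier.
sample_activities = [
    {
        "activity": "PushCommits",
        "start_date": "2025-01-07T14:29:34Z",
        "actor": {"login": "robodoo"},
        "repository": {"name": "odoo/odoo"},
    },
]
rows = activities_to_rows(sample_activities)
```

A list of dictionaries like `rows` can be passed directly to the `pandas.DataFrame` constructor to obtain a table with these columns.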


## 2.2 - Extract the features
Now that we have the DataFrame, we can extract the features using the RABBIT extractor.

In [7]:
df_feat = extract_features(df, USER)
display(df_feat)

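| | NA | NT | NOR | ORR | DCA_mean | DCA_median | DCA_std | DCA_gini | DCA_IQR | NAR_mean | ... | DCAT_mean | DCAT_median | DCAT_std | DCAT_gini | DCAT_IQR | NAT_mean | NAT_median | NAT_std | NAT_gini | NAT_IQR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| robodoo | 210 | 3 | 2 | 0.286 | 0.016 | 0.002 | 0.037 | 0.791 | 0.007 | 30.0 | ... | 0.015 | 0.002 | 0.033 | 0.775 | 0.01 | 70.0 | 74.0 | 15.395 | 0.095 | 15.0 |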


# 3 - Predict if the user is a bot or not
We can now use the BIMBAS model to predict whether the user is a bot or a human, based on the extracted features.

In [8]:
from rabbit import get_model, compute_confidence
import warnings

model = get_model()
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    proba = model.predict_proba(df_feat)
contributor_type, confidence = compute_confidence(proba[0][1])
print(f"{USER} is a {contributor_type} with a confidence of {confidence}")


robodoo is a Bot with a confidence of 0.936
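
To give an intuition for how a bot probability can be turned into a label and a confidence score, here is one plausible mapping. This is an assumption made for illustration and is not confirmed to be what `compute_confidence` does; the function name `label_with_confidence`, the 0.5 cut-off and the rescaling are all hypothetical.

```python
def label_with_confidence(bot_probability):
    """Map a bot probability in [0, 1] to a (label, confidence) pair.

    Assumed rule: probabilities above 0.5 mean "Bot", and the confidence is
    the distance from 0.5 rescaled to [0, 1] and rounded to three decimals.
    """
    label = "Bot" if bot_probability > 0.5 else "Human"
    confidence = round(abs(bot_probability - 0.5) * 2, 3)
    return label, confidence
```

Under this assumed rule, a probability of exactly 0.5 gives a confidence of 0, and probabilities near 0 or 1 give a confidence near 1.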
