# Consolidate commits
---

We have two delimited text files here:
  1. ant_commits.txt
  2. ant_tags.csv

The goal of the rest of the code is to combine the above two files to create a consolidated list of commits and release dates as separate csv files.

Each output file pertains to one release, i.e., it holds all the commits made on or after that release and before the next release.
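The window rule above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's method: the release dates, the version labels, and the helper `release_for` are all made up for the example. Unlike the loop further down, this sketch also assigns commits made after the last release to that last release.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical release dates and versions, sorted chronologically.
release_dates = [datetime(2000, 7, 18), datetime(2000, 10, 24), datetime(2001, 3, 2)]
release_versions = ["v1", "v2", "v3"]


def release_for(commit_date):
    """Return the version whose window [release, next release) contains commit_date."""
    position = bisect_right(release_dates, commit_date) - 1
    if position < 0:
        return None  # the commit predates the first release
    return release_versions[position]


print(release_for(datetime(2000, 7, 18)))   # on the release date itself
print(release_for(datetime(2000, 10, 23)))  # one day before the next release
print(release_for(datetime(2000, 10, 24)))  # on the next release date
```

A commit made exactly at a release timestamp counts toward that release, and the window closes just before the next release timestamp.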

### Imports

In [1]:
from __future__ import print_function
import pandas as pd
from pdb import set_trace
import warnings
from datetime import datetime
import dateutil.parser  # import the submodule explicitly so dateutil.parser.parse is available
warnings.filterwarnings("ignore")

### Read the csv files

1. The data is raw, so read it and add column labels.
2. Convert the first column (the timestamp, read as a string) to a datetime.
3. Sort the data chronologically.
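The three steps above can be tried on a small in-memory sample before touching the real files. This is only a sketch: the two rows below are made up, and `pd.to_datetime` is swapped in for the `dateutil` parser that the notebook cells use.

```python
import io
import pandas as pd

# Made-up rows in the "___"-delimited layout; the real file contents are not shown here.
raw = io.StringIO(
    "2000-07-19 09:16:18___b2c3___Second commit\n"
    "2000-07-18 07:50:21___a1b2___First commit\n"
)

# A multi-character separator is treated as a regular expression,
# which needs the Python parsing engine.
commits = pd.read_csv(raw, sep="___", header=None, engine="python")
commits.columns = ["Timestamp", "Commit ID", "Commit Message"]

# Parse the string column into datetimes, then order the rows chronologically.
commits["Timestamp"] = pd.to_datetime(commits["Timestamp"])
commits = commits.sort_values(by="Timestamp").reset_index(drop=True)

print(commits)
```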


In [2]:
# Multi-character delimiters are parsed as regular expressions, which requires the Python engine.
releases = pd.read_csv("ant_tags.csv", delimiter="  ", header=None, engine="python")
releases.columns = ["Timestamp", "Commit_ID", "Version"]
all_commits = pd.read_csv("ant_commits.txt", delimiter="___", header=None, engine="python")
all_commits.columns = ["Timestamp", "Commit ID", "Commit Message"]

In [3]:
"Format datetime"
releases["Timestamp"] = releases["Timestamp"].apply(lambda x: dateutil.parser.parse(x))
all_commits["Timestamp"] = all_commits["Timestamp"].apply(lambda x: dateutil.parser.parse(x))

In [4]:
"Sort data chronologically"
releases = releases.sort_values(by="Timestamp").reset_index(drop=True)
all_commits = all_commits.sort_values(by="Timestamp").reset_index(drop=True)

### Get release and commit dates

In [7]:
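commits_made = dict()
# Pair each release with the one that follows it; the last release has no successor and is skipped.
for current_release_date, next_release_date, current_release_version in zip(releases["Timestamp"][:-1], releases["Timestamp"][1:], releases["Version"][:-1]):
    commits_made.update({current_release_version: list()})
    for index, commit in all_commits.iterrows():
        # Keep commits made on or after this release and before the next one.
        if current_release_date <= commit["Timestamp"] < next_release_date:
            commits_made[current_release_version].append(commit)
            print(current_release_date, next_release_date, commit["Timestamp"])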

2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-18 07:50:21
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-18 07:50:21
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 09:16:18
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 12:35:12
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 13:00:46
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 13:02:42
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 13:29:59
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 15:12:36
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 15:26:19
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-19 16:00:53
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-20 07:35:56
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-20 10:02:05
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-20 12:50:33
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-20 13:25:50
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-21 09:43:15
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-07-21 10:19:23
2000-07-18 07:50:21 2000-10-24 08:45:24 

2000-07-18 07:50:21 2000-10-24 08:45:24 2000-09-29 15:51:13
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-02 13:52:08
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 04:35:34
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 09:18:48
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 09:29:17
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 09:46:04
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 09:57:55
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-04 10:42:00
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-05 07:48:37
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-05 07:58:06
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-05 09:12:07
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-05 12:14:46
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-05 17:19:57
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-06 07:27:33
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-06 07:32:49
2000-07-18 07:50:21 2000-10-24 08:45:24 2000-10-06 07:35:54
2000-07-18 07:50:21 2000-10-24 08:45:24 

KeyboardInterrupt: 
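The nested loop above scans every commit once per release, which is slow enough that the run was interrupted. One possible alternative, sketched here on made-up stand-ins for `releases` and `all_commits`, is to look up each commit's release with `Series.searchsorted`. The sample rows, the `assigned` and `commits_by_release` names, and the output file name in the comment are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-ins for the `releases` and `all_commits` frames, already sorted by time.
releases = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2000-07-18 07:50:21", "2000-10-24 08:45:24"]),
    "Version": ["v1", "v2"],
})
all_commits = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2000-07-18 07:50:21", "2000-07-19 09:16:18", "2000-10-25 10:00:00"]
    ),
    "Commit ID": ["a1", "b2", "c3"],
})

# For each commit, find the index of the latest release at or before it.
position = releases["Timestamp"].searchsorted(all_commits["Timestamp"], side="right") - 1

# Keep only commits that fall between two known releases, as the loop does.
in_range = (position >= 0) & (position < len(releases) - 1)
assigned = all_commits[in_range].copy()
assigned["Version"] = releases["Version"].to_numpy()[position[in_range]]

commits_by_release = {version: group for version, group in assigned.groupby("Version")}
for version, group in commits_by_release.items():
    print(version, len(group))
    # group.to_csv("commits_" + version + ".csv", index=False)  # hypothetical file name
```

Unlike the loop, this sketch produces no entry for a release that has no commits in its window.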

In [55]:
%%bash

atom ant_commits.txt