## Notebook to calculate vote creation trends per tag

### Description
This notebook calculates basic trend data, such as tag rank based on the cumulative number of up-votes up to a given date.

### Input
This notebook takes the `posts_tag.csv` and `votes.csv` files as input, both produced by the previous step.

### Output
This notebook produces a `votes_trends.csv` file with the following format:
```
Tag,CreationDate,TagPostsCreated,TagTotalPostsCreated,PostsCreated,TotalPostsCreated,TagPostsShare,TagRank
{tag},{creation-date},{tag-votes-created},{tag-total-votes-created},{votes-created},{total-votes-created},{tag-votes-share},{tag-rank}
```

where:
- `{tag}` - a single tag related to a post. For instance: `c#`;
- `{creation-date}` - creation month in 'YYYY-MM' format. For example: '2008-07';
- `{tag-votes-created}` - the number of votes created for posts with that tag in `{creation-date}`;
- `{tag-total-votes-created}` - the cumulative number of votes created for posts with that tag from the beginning up to `{creation-date}`;
- `{votes-created}` - the number of all votes created in `{creation-date}`;
- `{total-votes-created}` - the cumulative number of all votes created from the beginning up to `{creation-date}`;
- `{tag-votes-share}` - the percentage of votes for posts with that tag relative to all votes. Calculated as `{tag-total-votes-created} / {total-votes-created} * 100`;
- `{tag-rank}` - the rank of the `{tag}` based on `{tag-votes-share}` in comparison to other tags at the same `{creation-date}`.

For example:
```csv
Tag,CreationDate,TagPostsCreated,TagTotalPostsCreated,PostsCreated,TotalPostsCreated,TagPostsShare,TagRank
```
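
The column definitions above can be sketched in pandas on a small in-memory table. This is an illustration only, not necessarily how the notebook computes them: the toy data, the `VoteId` column, the `...Votes...` column names, and the choice to fill months without votes with zero are all assumptions.

```python
import pandas as pd

# Toy data (hypothetical): one row per (vote, tag). Vote 1 belongs to a post with two tags.
votes = pd.DataFrame(
    [(1, "c#", "2008-07"), (1, ".net", "2008-07"), (2, "c#", "2008-07"),
     (3, "python", "2008-07"), (4, "python", "2008-08"), (5, "python", "2008-08")],
    columns=["VoteId", "Tag", "CreationDate"],
)

# Votes per tag and month; months without votes for a tag are filled with 0
# so that the cumulative total is defined for every (tag, month) pair.
counts = votes.groupby(["Tag", "CreationDate"])["VoteId"].nunique()
full_index = pd.MultiIndex.from_product(counts.index.levels, names=["Tag", "CreationDate"])
trends = counts.reindex(full_index, fill_value=0).rename("TagVotesCreated").reset_index()
trends = trends.sort_values(["Tag", "CreationDate"])
trends["TagTotalVotesCreated"] = trends.groupby("Tag")["TagVotesCreated"].cumsum()

# Distinct votes per month across all tags, plus their running total.
monthly = (
    votes.groupby("CreationDate")["VoteId"].nunique()
    .rename("VotesCreated").sort_index().reset_index()
)
monthly["TotalVotesCreated"] = monthly["VotesCreated"].cumsum()
trends = trends.merge(monthly, on="CreationDate")

# Share of cumulative votes and rank within each month (1 = largest share).
trends["TagVotesShare"] = trends["TagTotalVotesCreated"] / trends["TotalVotesCreated"] * 100
trends["TagRank"] = (
    trends.groupby("CreationDate")["TagVotesShare"]
    .rank(ascending=False, method="min")
    .astype(int)
)
print(trends)
```

Because a vote on a post with several tags counts once for each tag, the shares within a month can add up to more than 100.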

In [None]:
import pandas as pd
from config import get_file_path

#### Load data, show shape and sample

In [None]:
posts_tag_file_path = get_file_path("posts_tag.csv")
posts_tag_df = pd.read_csv(posts_tag_file_path)
posts_tag_df

In [None]:
votes_file_path = get_file_path("votes.csv")
votes_df = pd.read_csv(votes_file_path)
votes_df
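
To show the shape and a sample explicitly, rather than relying on the notebook's default rendering of a data frame, something like the following could be used. It is a minimal sketch that reads a hypothetical in-memory CSV; the column names are assumptions and may not match the real `posts_tag.csv`.

```python
import io
import pandas as pd

# Hypothetical stand-in for posts_tag.csv; the real columns may differ.
csv_text = "Id,Tag,CreationDate\n1,c#,2008-07\n1,.net,2008-07\n2,python,2008-08\n"
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)                        # (number of rows, number of columns)
print(sample_df.head(2))                      # first two rows
print(sample_df.sample(n=2, random_state=0))  # two random rows, reproducible via the seed
```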

In [None]:
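# Inner join: attach to each (post, tag) row every vote cast on that post.
# Posts without votes and votes without a matching post are dropped;
# column names present in both frames get the _Post / _Vote suffixes.
vote_creation_trends_df = pd.merge(
    posts_tag_df,
    votes_df,
    left_on='Id',
    right_on='PostId',
    how='inner',
    suffixes=('_Post', '_Vote')
)
vote_creation_trends_df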
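
The effect of this inner join can be checked on toy data. The frames below are hypothetical and only assume that both inputs carry `Id` and `CreationDate` columns and that votes reference posts through `PostId`.

```python
import pandas as pd

# Hypothetical (post, tag) rows: post 1 has two tags, post 3 has no votes.
posts_tag = pd.DataFrame({
    "Id": [1, 1, 2, 3],
    "Tag": ["c#", ".net", "python", "java"],
    "CreationDate": ["2008-07", "2008-07", "2008-08", "2008-08"],
})
# Hypothetical votes: vote 13 points at a post that is not in posts_tag.
votes = pd.DataFrame({
    "Id": [10, 11, 12, 13],
    "PostId": [1, 1, 2, 99],
    "CreationDate": ["2008-07", "2008-08", "2008-08", "2008-09"],
})

merged = pd.merge(
    posts_tag, votes,
    left_on="Id", right_on="PostId",
    how="inner", suffixes=("_Post", "_Vote"),
)
print(merged)
```

Post 1 yields four rows (two tags times two votes), post 2 yields one, and both the unvoted post 3 and the orphan vote 13 disappear. Each vote is repeated once per tag of its post, which matters when counting all votes across tags.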