This project analyses data obtained from a telecommunications operator (telco) in Italy.
The main data analysis tools are GraphLab and Pandas. GraphLab is a distributed data analysis framework that can also be used for machine learning applications.
In this implementation GraphLab has proven quite efficient at high-level aggregations, especially where the dataset is large and spread across a distributed storage system. Pandas, on the other hand, is efficient for in-memory computations and offers more fine-grained APIs.
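
To make the Pandas side of this division of labour concrete, here is a minimal sketch of an in-memory aggregation followed by a finer-grained, row-level computation. The column names (`cell_id`, `hour`, `calls`) and the values are hypothetical and are not taken from the actual telco dataset.

```python
import pandas as pd

# Hypothetical call records; column names and values are illustrative only.
records = pd.DataFrame(
    {
        "cell_id": ["A", "A", "B", "B", "B"],
        "hour": [9, 10, 9, 9, 11],
        "calls": [12, 7, 3, 5, 8],
    }
)

# High-level aggregation: total and mean number of calls per cell.
per_cell = records.groupby("cell_id").agg(
    total_calls=("calls", "sum"),
    mean_calls=("calls", "mean"),
)

# Fine-grained, row-level manipulation: each record's share of its cell's total.
cell_totals = records.groupby("cell_id")["calls"].transform("sum")
records["share_of_cell"] = records["calls"] / cell_totals

print(per_cell)
print(records)
```

In a workflow like the one described, the heavy aggregation over the full dataset would presumably run in GraphLab, with only the reduced result handed to Pandas for this kind of detailed, in-memory work.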