# Code Pie Graph

The Pie Graph visualizes the programming languages used in a given directory. A list of programming languages and their file extensions is parsed from a [Wikipedia article](https://en.wikipedia.org/wiki/List_of_programming_languages). This data is saved as a `.json` file in the `Data` directory and is used to identify the language of each file by its extension. Each language is weighted by the total size of its files in bytes, comments included, so the results may look different from the language breakdown on GitHub. To generate a Pie Chart for any directory, copy it into this folder and run the script.
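As a rough sketch of the lookup step, the snippet below maps a file's extension to the languages that list it. The `code_dict` contents here are a hypothetical excerpt, not the actual generated JSON file, and `languages_for` is an illustrative helper name that does not appear in the notebook.

```python
import os

# Hypothetical excerpt of the generated data: language name -> list of extensions.
code_dict = {
    'Python': ['.py', '.pyw'],
    'JavaScript': ['.js'],
    'MATLAB': ['.m'],
    'Objective-C': ['.m', '.h'],
}

def languages_for(path, code_dict):
    """Return every language whose extension list contains the file's extension."""
    ext = os.path.splitext(path)[1]
    return [lang for lang, extns in code_dict.items() if ext in extns]

print(languages_for('course_tool/app.py', code_dict))  # ['Python']
print(languages_for('solver.m', code_dict))            # ['MATLAB', 'Objective-C']
```

Because several languages can share an extension, a single file may be credited to more than one language, which is another reason the chart can differ from GitHub's breakdown.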

As an example, here is the Pie Chart for my [UW-Course-Tool](https://github.com/AlexEidt/UW-Course-Tool) repository.

In [1]:
import os, re, json, requests
import pandas as pd
from itertools import cycle
from collections import Counter
from bs4 import BeautifulSoup
from matplotlib import pyplot as plt

In [2]:
def find_languages(dirname, code_dict, counter):
    """Recursively analyzes all files/directories under the given dirname
    @params
        'dirname': File or directory to search
        'code_dict': Dictionary mapping programming language names to their extensions
        'counter': Dictionary of total file size in bytes for each programming language found
    """
    if not os.path.isdir(dirname):
        ext = os.path.splitext(dirname)[1]
        if ext and ext != '.md': # Filter out Markdown files and files without an extension
            for lang, extns in code_dict.items():
                if ext in extns:
                    # A file counts toward every language that lists its extension
                    counter[lang] = counter.get(lang, 0) + os.stat(dirname).st_size
    else:
        for f in os.listdir(dirname):
            find_languages(os.path.join(dirname, f), code_dict, counter)
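The same traversal can also be written without explicit recursion. The following is a minimal sketch using `os.walk` and `collections.Counter`; the function name `count_bytes_by_language` and the `skip_exts` parameter are assumptions made for illustration and are not part of the notebook.

```python
import os
from collections import Counter

def count_bytes_by_language(root, code_dict, skip_exts=('.md',)):
    """Sum file sizes in bytes per language for every file under root."""
    totals = Counter()
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1]
            if not ext or ext in skip_exts:
                continue  # Skip Markdown files and files without an extension
            size = os.path.getsize(os.path.join(folder, name))
            for lang, extns in code_dict.items():
                if ext in extns:
                    totals[lang] += size
    return totals
```

Calling `count_bytes_by_language(directory, code_dict)` returns a `Counter` that can be turned into the `Language` and `Frequency` columns in the same way as the `counter` dictionary.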

In [4]:
dir_name = input('Enter Directory to Scan. Must be complete file path: ')
directory = os.path.normpath(dir_name)
directory_name = os.path.basename(directory) # Last path component, using the platform's separator
with open(os.path.join(os.getcwd(), 'Data', 'progamming_languages.json'), mode='r') as file:
    code_dict = json.load(file)
counter = {}
find_languages(directory, code_dict, counter)
lang_freq = {'Language': list(counter.keys()), 'Frequency': list(counter.values())}
df = pd.DataFrame(lang_freq)
# Offset ("explode") the wedge of the highest frequency language; all other wedges stay at 0
df['Explode'] = (df['Frequency'] == df['Frequency'].max()) * 0.1

# Create Pie Graph
plt.style.use('fivethirtyeight')

plt.pie(df['Frequency'], labels=df['Language'], explode=df['Explode'],
        wedgeprops={'edgecolor': 'black', 'linewidth': 2},
        startangle=90, shadow=True, autopct='%1.0f%%',
        rotatelabels=False)

plt.title(f'Code used in {directory_name}')
plt.figtext(0.90, 0.05, 
            '*Comments included\n*Markup Languages not Included\n(except HTML, CSS)', 
            horizontalalignment='right', verticalalignment='bottom', fontsize='xx-small')

plt.tight_layout()

Enter Directory to Scan. Must be complete file path: C:\Users\alex\Documents\UWCourseWeb


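One way to build the `Explode` column without chained indexing (the pattern pandas flags with `SettingWithCopyWarning`) is a single `.loc` assignment. The sketch below uses hypothetical byte counts in place of a real scan; the `Percent` column is added only to show the values that `autopct` renders on the wedges.

```python
import pandas as pd

# Hypothetical byte counts, standing in for the `counter` dictionary from a real scan.
counter = {'Python': 5200, 'HTML': 1800, 'CSS': 900}

df = pd.DataFrame({'Language': list(counter.keys()), 'Frequency': list(counter.values())})

# Start with no offset, then offset only the row(s) with the highest frequency.
df['Explode'] = 0.0
df.loc[df['Frequency'] == df['Frequency'].max(), 'Explode'] = 0.1

# Share of the total that each wedge represents.
df['Percent'] = 100 * df['Frequency'] / df['Frequency'].sum()

print(df)
```

Selecting the rows and the column in one `.loc` call writes to the original DataFrame directly, so no intermediate copy is involved.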
