# Visualize Directory
- Based on the post at http://www.austintaylor.io/d3/python/pandas/2016/02/01/create-d3-chart-python-force-directed/

## The Network Structure
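- The network is a dictionary with two lists, `nodes` and `links`.
- `nodes` contains one entry per node (here, one file or folder path).
- `links` contains the relationships between nodes; `source` and `target` are zero-based positions in the `nodes` list.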

```json
{
  "nodes":  [
    { "name": "desktop", "group":  1},
    { "name": "desktop/apples.txt", "group":  1},
    { "name": "desktop/pineapple/apples.txt", "group":  1},
    { "name": "desktop/bananas.txt", "group":  1}
  ],

  "links":  [
    { "source":  1,  "target":  0,  "value":  5555 },
    { "source":  2,  "target":  0,  "value":  1 },
    { "source":  3,  "target":  0,  "value": 1 }
  ]
}
```
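
As a minimal sketch of how the two lists fit together, the snippet below rebuilds the example in Python and resolves each link's indices back to node names. The helper `describe_links` is a hypothetical name introduced for illustration; it is not part of the original notebook.

```python
import json

# The example network from above, written as a Python dictionary.
network = {
    "nodes": [
        {"name": "desktop", "group": 1},
        {"name": "desktop/apples.txt", "group": 1},
        {"name": "desktop/pineapple/apples.txt", "group": 1},
        {"name": "desktop/bananas.txt", "group": 1},
    ],
    "links": [
        {"source": 1, "target": 0, "value": 5555},
        {"source": 2, "target": 0, "value": 1},
        {"source": 3, "target": 0, "value": 1},
    ],
}


def describe_links(net):
    """Return (source name, target name) pairs by looking up link indices in the nodes list."""
    names = [node["name"] for node in net["nodes"]]
    return [(names[link["source"]], names[link["target"]]) for link in net["links"]]


for source_name, target_name in describe_links(network):
    print(source_name, "->", target_name)

# The dictionary serializes directly to JSON of the shape shown above.
json_text = json.dumps(network, indent=2)
```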

## Setup

### Modules

In [1]:
import os
import pandas
import json

### Set path of directory you wish to visualize

In [2]:
path = '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/'
export_path = "/users/danielcorcoran/desktop/github_repos/python_nb_network/"

### Set group node option

In [3]:
set_groups_to_file_types = True

## Collect Data

### Create a list of all absolute paths under the path directory; it is used later to branch out relationships

In [4]:
absolute_paths = []

In [5]:
def clean_path(raw_path):
    # Collapse the doubled slash produced when dirpath ends with "/"
    # and drop any quote characters from the path
    return raw_path.strip().replace("//", "/").replace("'", "").replace('"', '')


for dirpath, dirnames, filenames in os.walk(path):

    # Record every sub-directory, then every file, found under dirpath
    for name in dirnames + filenames:
        absolute_paths.append(clean_path(dirpath + "/" + name))
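
An alternative worth considering is `os.path.join`, which inserts a single separator and so avoids the `"//"` cleanup. The sketch below is one possible version, not the notebook's method: `collect_paths` and the temporary demo tree are hypothetical, and unlike the cell above it does not strip quote characters from names.

```python
import os
import tempfile


def collect_paths(root):
    """Return every directory and file path found under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # os.path.join adds one separator only when it is needed
            found.append(os.path.join(dirpath, name))
    return found


# Small demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "data"))
    with open(os.path.join(demo_root, "data", "apples.txt"), "w") as handle:
        handle.write("demo")
    demo_paths = collect_paths(demo_root)
    print(sorted(os.path.relpath(p, demo_root) for p in demo_paths))
```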

### Store data in pandas dataframe

In [6]:
data = pandas.DataFrame(absolute_paths)
data.rename({0: "absolute_path"}, axis=1, inplace=True)
data.head(15)

Unnamed: 0,absolute_path
0,/Users/danielcorcoran/desktop/github_repos/pyt...
1,/Users/danielcorcoran/desktop/github_repos/pyt...
2,/Users/danielcorcoran/desktop/github_repos/pyt...
3,/Users/danielcorcoran/desktop/github_repos/pyt...
4,/Users/danielcorcoran/desktop/github_repos/pyt...
5,/Users/danielcorcoran/desktop/github_repos/pyt...
6,/Users/danielcorcoran/desktop/github_repos/pyt...
7,/Users/danielcorcoran/desktop/github_repos/pyt...
8,/Users/danielcorcoran/desktop/github_repos/pyt...
9,/Users/danielcorcoran/desktop/github_repos/pyt...


### Create two columns, destination and source

In [7]:
for index in range(data.shape[0]):
    item = data.iloc[index, 0]
    split = item.split("/")

    # The source (parent) is the path without its last component;
    # the destination is the full path itself
    source = "/".join(split[:-1])
    destination = item

    data.loc[index, "destination"] = destination
    data.loc[index, "source"] = source
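
For paths like the ones in this notebook, the parent can also be obtained with `posixpath.dirname` from the standard library instead of splitting and re-joining by hand. This is a minimal sketch under that assumption; `split_relationship` is a hypothetical helper name.

```python
import posixpath


def split_relationship(absolute_path):
    """Return (source, destination): the parent path and the path itself."""
    return posixpath.dirname(absolute_path), absolute_path


example = "/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/README.md"
source, destination = split_relationship(example)
print(source)
```

Applied to the dataframe, the same idea could be written as `data["absolute_path"].map(posixpath.dirname)`, which would avoid the explicit row loop.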

In [8]:
data.head()

Unnamed: 0,absolute_path,destination,source
0,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
1,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
2,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
3,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
4,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...


In [9]:
data

Unnamed: 0,absolute_path,destination,source
0,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
1,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
2,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
3,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
4,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
5,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
6,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
7,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
8,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...
9,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...,/Users/danielcorcoran/desktop/github_repos/pyt...


### Create groups based on file type (optional)

In [10]:
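for index in range(data.shape[0]):

    absolute_path = data.loc[index, "absolute_path"]

    # The last path component is the file or folder name
    last_item = absolute_path.split("/")[-1]

    # Any name containing a dot is treated as a file, so dot-folders
    # such as .git end up with a group of their own
    if "." in last_item:
        data.loc[index, "file_extension"] = "." + last_item.split(".")[-1]
    else:
        data.loc[index, "file_extension"] = "folder"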
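
Because the check above only looks for a dot in the name, folders such as `.git` are labelled as if they were file types. One possible refinement, sketched here as an assumption rather than the notebook's behavior, is to decide "folder" from a directory flag (for example the result of `os.path.isdir`) and take the extension with `posixpath.splitext`. The helper `classify` and the `"no_extension"` label are hypothetical. Note that `splitext` treats a leading dot as part of the name, so hidden files such as `.gitignore` have no extension under this scheme.

```python
import posixpath


def classify(path, is_dir):
    """Return "folder" for directories, otherwise the file extension (or "no_extension")."""
    if is_dir:
        return "folder"
    extension = posixpath.splitext(posixpath.basename(path))[1]
    return extension if extension else "no_extension"


print(classify("/repo/.git", is_dir=True))         # folder
print(classify("/repo/README.md", is_dir=False))   # .md
print(classify("/repo/.gitignore", is_dir=False))  # no_extension
```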

In [11]:
unique_extensions = list(data["file_extension"].unique())
unique_extensions

['folder',
 '.ipynb_checkpoints',
 '.git',
 '.ipynb',
 '.DS_Store',
 '.md',
 '.log',
 '.gitignore',
 '.sample',
 '.shx',
 '.xml',
 '.shp',
 '.dbf',
 '.prj']

### Create a list containing only destinations; it is used to build the nodes list of the main dictionary

In [12]:
destination_list = list(data["destination"])
destination_list

['/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/drivers',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.ipynb_checkpoints',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.git',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/data',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_calculate_closest_street_pool.ipynb',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.DS_Store',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_reverse_geocode.ipynb',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/README.md',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_calculate_closest_street.ipynb',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/geckodriver.log',
 '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.gitignore',
 '/Users/danielcorcoran/deskto

In [13]:
# Append the root directory itself so that it becomes the final node
destination_list.append(path)

### Create nodes_list

In [14]:
nodes_list = []

for index in range(len(destination_list)):

    if index == len(destination_list) - 1:
        # The last entry is the root directory; give it a group of its own
        group_index = 999999999999999999
    elif set_groups_to_file_types:
        group_text = data.loc[index, "file_extension"]
        group_index = unique_extensions.index(group_text)
    else:
        group_index = 1

    nodes_list.append({"group": group_index, "name": destination_list[index]})
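
The group numbering can also be sketched without a dataframe, using a dictionary that hands out the next free number the first time an extension is seen (the same first-appearance order that `unique()` gives). The function `build_nodes` and the `root_group=-1` default are hypothetical choices for this illustration; the notebook uses a large sentinel number for the root instead.

```python
def build_nodes(names, extensions, root_name, root_group=-1):
    """Build the nodes list; group numbers follow the first appearance of each extension."""
    group_of = {}
    nodes = []
    for name, extension in zip(names, extensions):
        # setdefault assigns the next free group number to a new extension
        group = group_of.setdefault(extension, len(group_of))
        nodes.append({"group": group, "name": name})
    # The root directory is appended last, in a group of its own
    nodes.append({"group": root_group, "name": root_name})
    return nodes


demo_nodes = build_nodes(
    ["/repo/data", "/repo/a.ipynb", "/repo/b.ipynb"],
    ["folder", ".ipynb", ".ipynb"],
    root_name="/repo",
)
print(demo_nodes)
```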

In [15]:
nodes_list

[{'group': 0,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/drivers'},
 {'group': 1,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.ipynb_checkpoints'},
 {'group': 2,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.git'},
 {'group': 0,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/data'},
 {'group': 3,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_calculate_closest_street_pool.ipynb'},
 {'group': 4,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.DS_Store'},
 {'group': 3,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_reverse_geocode.ipynb'},
 {'group': 5,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/README.md'},
 {'group': 3,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/notebook_calculate_close

In [16]:
nodes_list[:3]

[{'group': 0,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/drivers'},
 {'group': 1,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.ipynb_checkpoints'},
 {'group': 2,
  'name': '/Users/danielcorcoran/desktop/github_repos/python_nb_data_spatial/.git'}]

### Next, build up the second part of the dictionary, the links list

In [17]:
links_list = []

In [18]:
for index in range(data.shape[0]):

    # Each row becomes one link between its parent (source) and itself (target)
    target = index

    source_text = data.loc[index, "source"]

    try:
        source = destination_list.index(source_text)
    except ValueError:
        # Parent not found among the destinations; fall back to the root node
        print(index, ' has failed, attempting alternative method')
        source = len(destination_list) - 1

    links_list.append({"source": source, "target": target, "value": 1})

0  has failed, attempting alternative method
1  has failed, attempting alternative method
2  has failed, attempting alternative method
3  has failed, attempting alternative method
4  has failed, attempting alternative method
5  has failed, attempting alternative method
6  has failed, attempting alternative method
7  has failed, attempting alternative method
8  has failed, attempting alternative method
9  has failed, attempting alternative method
10  has failed, attempting alternative method
11  has failed, attempting alternative method
12  has failed, attempting alternative method
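
The failures listed above appear to come from the top-level entries: their `source` value has no trailing slash, while `path` was appended to `destination_list` with one, so the lookup misses and the fallback picks the root anyway. A minimal sketch of a variant that normalizes the root and uses a dictionary lookup is shown below; `build_links` and the demo paths are hypothetical.

```python
def build_links(destinations, sources, root):
    """Return links whose source/target are indices into destinations + [root]."""
    # Stripping the trailing "/" lets the root match the parent paths
    names = list(destinations) + [root.rstrip("/")]
    index_of = {name: position for position, name in enumerate(names)}
    links = []
    for target, source_text in enumerate(sources):
        # Unknown parents still fall back to the root node
        source = index_of.get(source_text, len(names) - 1)
        links.append({"source": source, "target": target, "value": 1})
    return links


demo_links = build_links(
    destinations=["/repo/data", "/repo/data/apples.txt"],
    sources=["/repo", "/repo/data"],
    root="/repo/",
)
print(demo_links)
```

The dictionary lookup also avoids rescanning the list for every row, which may matter for large directory trees.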


## Process final data

### Merge nodes and links lists into one dictionary

In [19]:
json_data = {"nodes": nodes_list, "links": links_list}

### Convert python dictionary to json string

In [20]:
json_dump = json.dumps(json_data, indent=1, sort_keys=True)

### Export to the file 'pcap_export.json', to be used by index.html

In [21]:
# The with-block closes the file even if the write fails
with open(export_path + "pcap_export.json", "w") as json_out:
    json_out.write(json_dump)
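
As a quick sanity check, the export can be written with `json.dump` and read back to confirm the file holds valid JSON. The sketch below uses a temporary directory and a tiny stand-in dictionary (`demo_data`) rather than the notebook's `export_path` and `json_data`; both are assumptions made for the demonstration.

```python
import json
import os
import tempfile

demo_data = {
    "nodes": [{"group": 0, "name": "/repo"}],
    "links": [],
}

with tempfile.TemporaryDirectory() as demo_dir:
    demo_file = os.path.join(demo_dir, "pcap_export.json")

    # json.dump writes straight to the open file; the with-block closes it
    with open(demo_file, "w") as handle:
        json.dump(demo_data, handle, indent=1, sort_keys=True)

    # Reading the file back confirms it contains valid JSON
    with open(demo_file) as handle:
        round_trip = json.load(handle)

print(round_trip == demo_data)
```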