# Mapping multiple organizations with getorg
This notebook generates the data used by [this markercluster leaflet HTML/Javascript page](http://getorg.github.io/jupyter_map/), which runs entirely in the viewer's web browser.
## Installing and importing libraries
We need a few packages for this, which can be installed with pip:

In [1]:
!pip install getorg --upgrade
!pip install ipyleaflet
!jupyter nbextension enable --py ipyleaflet

Unrecognized JSON config file version, assuming version 1
Enabling notebook extension jupyter-leaflet/extension...
      - Validating: OK


To access the GitHub API, we use an object provided by PyGithub. To make more than 60 queries per hour, you need to log in with your GitHub account. You can do this with your username and password:

    gh = Github(login_or_token=your_username, password=your_password)

However, the better way is to [get an API token](https://github.com/settings/tokens) (you don't have to grant it any privileges to query public repositories) and pass this token. GitHub's API has since stopped accepting account passwords, so a token is effectively required today. I've got mine stored in a file called ghlogin.py (which I have not uploaded to GitHub), which I import.
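
One possible layout for such a file is sketched below. The notebook only relies on `ghlogin.py` exposing a `gh_key` attribute; reading the token from an environment variable named `GITHUB_TOKEN` is an assumption made here so that the token never appears in the file itself.

```python
# ghlogin.py: keep this file out of version control (e.g. list it in .gitignore).
import os

# GITHUB_TOKEN is a hypothetical environment variable name chosen for this sketch.
# gh_key is None when the variable is unset.
gh_key = os.environ.get("GITHUB_TOKEN")
```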

In [2]:
import getorg
from github import Github
import ghlogin
gh = Github(login_or_token=ghlogin.gh_key)

IPywidgets and ipyleaflet support enabled.


## Working with the orgmap module

With this getorg notebook, we are working with the orgmap module, so all the functions are called with a prefix of:

    getorg.orgmap.
    
The functions we will be using are:

    getorg.orgmap.map_orgs()
    getorg.orgmap.location_dict_to_csv()
    getorg.orgmap.location_dict_to_jsvar()
    
## The map_orgs function

### Parameters

The getorg.orgmap.map_orgs function takes in two to four parameters:
* github_obj: a Github API object, which we created when we logged in in the previous cell
* org_list_or_object: A list of strings containing the names of organizations to map, a string containing a single organization, or a Github organization object
* debug: A debug level (optional). 
  * 0 is silent, no output
  * 1 (default) prints the name of the org queried and one character per contributor queried: . for success, E for error
  * 2 (what we are calling here) prints everything in level 1, as well as the name of the repository queried
  * 3 prints everything in level 2, as well as the location of each contributor or the error raised
* exclude_usernames (optional): A list of strings containing usernames to exclude from the map and location dictionary
  
### Returned values

The map_orgs function returns three objects:
* org_map: an ipyleaflet Map object that will display inline in a Jupyter notebook (type: ipyleaflet.leaflet.Map)
* org_locations: a dictionary with the key value pairs { Github username URL : geopy Location object }
  * Note: geopy Location objects have many features. You can find the latitude and longitude in loc.latitude / loc.longitude, for example.
* org_metadata: a dictionary with four key value pairs:
  * user_loc_count: the number of unique usernames in the map and location dictionary
  * error_count: the number of times an error was thrown when querying a contributor's location
  * no_loc_count: the number of users who have no location set in their Github profile
  * duplicate_count: the number of times a username was found in multiple repositories 
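
As a small illustration of working with the returned dictionary, the sketch below pulls plain coordinates out of a dictionary shaped like `org_locations`. To stay runnable without a network query, it uses a hypothetical stand-in class `FakeLocation` that mimics only the `latitude` and `longitude` attributes of a geopy Location; the sample URL and coordinates are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a geopy Location: only the two attributes used here.
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])

# Made-up sample shaped like the dictionary described above:
# { GitHub username URL : Location object }
sample_locations = {
    "https://github.com/example-user": FakeLocation(37.87, -122.27),
}

def to_coordinates(locations):
    """Return a list of (username_url, latitude, longitude) tuples."""
    return [(url, loc.latitude, loc.longitude) for url, loc in locations.items()]

coords = to_coordinates(sample_locations)
print(coords)
```

With a real result, `to_coordinates(org_locations)` should work the same way, since it only touches the two attributes mentioned in the note above.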

### Example: querying Jupyter and IPython organizations
We first create a list containing strings of all the organizations. You can also just send a single organization, in either a list of one or just a string.

In [3]:
orgs = ["jupyter","ipython","jupyter-attic","jupyterhub"]

Then we call the function with the Github object, the list of organizations, and, optionally, the debug and exclude_usernames parameters (here only debug=2 is passed). Note: it takes 20-30 minutes of 'wall time' to get all the contributor locations for each of these organizations.

In [4]:
org_map, org_locations, org_metadata = getorg.orgmap.map_orgs(gh, orgs, debug=2)


Querying organization Project Jupyter

Querying repository nbviewer
............................................
Querying repository nbconvert-examples
.....
Querying repository colaboratory
.............
Querying repository jupyter.github.io
...................
Querying repository design
.........
Querying repository nbcache
..
Querying repository nbgrader
............
Querying repository tmpnb
.....................
Querying repository nature-demo
......
Querying repository jupyter-drive
.........
Querying repository tmpnb-redirector
...
Querying repository tmpnb-deploy
...
Querying repository docker-demo-images
......................
Querying repository try.jupyter.org
...
Querying repository strata-sv-2015-tutorial
..
Querying repository testpath
.
Querying repository scipy-2015-advanced-topics
.
Querying repository jupyter_core
..............
Querying repository nbformat
........................
Querying repository jupyter_client
..................................................


In [5]:
org_metadata

{'duplicate_count': 588,
 'error_count': 0,
 'no_loc_count': 649,
 'user_loc_count': 726}
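
A quick way to read these numbers is to compute what share of contributors could be placed on the map. The sketch below copies the values printed above into a literal dictionary. It assumes that `user_loc_count` and `no_loc_count` together cover every unique contributor that was queried without error; that reading of the counters is an assumption, not something stated in the descriptions above.

```python
# Values copied from the output above.
org_metadata = {
    "duplicate_count": 588,
    "error_count": 0,
    "no_loc_count": 649,
    "user_loc_count": 726,
}

# Assumption: located + unlocated users = all unique contributors queried.
total_users = org_metadata["user_loc_count"] + org_metadata["no_loc_count"]
located_share = org_metadata["user_loc_count"] / total_users

print(f"{total_users} unique contributors, {located_share:.1%} with a mappable location")
```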

## Displaying the map
In a Jupyter notebook with ipyleaflet support, we can display the map by referring to the org_map object. Note that the map does not always render well if you are not running the notebook on a Jupyter server (for example, if you are viewing this on GitHub or nbviewer), so I've added a screenshot image to show what it looks like.

In [6]:
org_map

Widget Javascript not detected.  It may not be installed properly. Did you enable the widgetsnbextension? If not, then run "jupyter nbextension enable --py --sys-prefix widgetsnbextension"


![jupyter map](jupyter-map.png)

You can export this to embeddable HTML files with the Widgets->Embed Widgets option in the Jupyter notebook menu bar.

## Writing the map to a file
But what if you don't have ipyleaflet support or don't want to have to spend a long time querying in a Jupyter notebook every time you want to display the map? We can write the locations to a file with two getorg.orgmap methods. Then we can display these points using whatever mapping software we want.

In both of the following methods, I am setting the hashed_usernames option to True, which outputs a SHA1 hash of the username in the data file instead of the username. I'm doing this because I haven't fully thought out the privacy implications of putting people's usernames on a map, even if Github data is public and accessible. It is something to think about.
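
To give a sense of what a hashed username looks like, here is a minimal sketch using Python's standard `hashlib`. The helper name `hash_username` and the UTF-8 encoding step are assumptions for illustration; getorg's internal implementation may differ in detail, so these digests are not guaranteed to match its output.

```python
import hashlib

def hash_username(username):
    """Return the SHA-1 hex digest of a username (hypothetical helper)."""
    return hashlib.sha1(username.encode("utf-8")).hexdigest()

digest = hash_username("example-user")
print(len(digest))  # a SHA-1 hex digest is always 40 characters long
```

Keep in mind that hashing hides the name from casual view but is not strong anonymization: anyone with a list of candidate usernames can hash them and compare.
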
### location_dict_to_jsvar
This outputs to a single Javascript variable, for use with the markercluster HTML / Javascript file in this directory.


In [8]:
getorg.orgmap.location_dict_to_jsvar(org_locations, "jupyter-locations.js", hashed_usernames=True)

'Written to jupyter-locations.js'

### location_dict_to_csv
The syntax is similar, but this function writes to a CSV file instead.

In [9]:
getorg.orgmap.location_dict_to_csv(org_locations, "jupyter-locations.csv", hashed_usernames=True)

'Written to jupyter-locations.csv'
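
Once the CSV exists, it can be inspected without re-running the queries. The sketch below makes no assumption about the column layout and simply returns the first few rows. `preview_csv` is a hypothetical helper, and the demo writes a small made-up file to a temporary directory so the example runs on its own.

```python
import csv
import os
import tempfile

def preview_csv(path, n=5):
    """Return the first n rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        return [row for _, row in zip(range(n), reader)]

# Demo with made-up data; point preview_csv at "jupyter-locations.csv" instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["a", "1.0", "2.0"], ["b", "3.0", "4.0"]])
    rows = preview_csv(demo_path, n=1)

print(rows)
```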

## Plotting data files using a markercluster HTML / Javascript page
We want a nice clustered map, so that we can represent the denser regions. Right now, the ~20 points at the center of Berkeley, California are all layered on top of each other. There is a [nice markercluster package](https://github.com/Leaflet/Leaflet.markercluster) that lets us do this, but just in standalone HTML files, not in Jupyter notebooks. So we are going to take that exported jsvar file with all the location data and import it into an HTML file running this markercluster script.

To set this up on a web server, it is best to just copy all the files in the examples/orgmap/ folder into something publicly-accessible, making sure to include the /leaflet_dist/ folder. The HTML file relies on a set of files in the /leaflet_dist/ folder to do the clustering. This also copies this Jupyter notebook, so you might want to remove it.

The HTML file has two parts: a header where all the scripts and stylesheets are imported, and a body that displays the map and any other text. 

In the head, the last script line points to the .js file we just wrote. Change it to whatever you named your output file. You don't need to change anything in the body if you want it to default to a zoomed-out map of the world.

    <html>
    <head>
        <title>Maps</title>

        <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.0.0-beta.2/leaflet.css" />
        <script src="https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.0.0-beta.2/leaflet.js"></script>
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <link rel="stylesheet" href="leaflet_dist/screen.css" />

        <link rel="stylesheet" href="leaflet_dist/MarkerCluster.css" />
        <link rel="stylesheet" href="leaflet_dist/MarkerCluster.Default.css" />
        <script src="leaflet_dist/leaflet.markercluster-src.js"></script>
        <script src="jupyter-locations.js"></script>

    </head>
    
    <body>
    
    	<div id="map"></div>
        <script type="text/javascript">

		var tiles = L.tileLayer('http://server.arcgisonline.com/ArcGIS/rest/services/World_Street_Map/MapServer/tile/{z}/{y}/{x}', {
        maxZoom: 18,
        attribution: 'Tiles &copy; Esri &mdash; Source: Esri, DeLorme, NAVTEQ, USGS, Intermap, iPC, NRCAN, Esri Japan, METI, Esri China (Hong Kong), Esri (Thailand), TomTom, 2012'}),
		latlng = L.latLng(30, 10);

		var map = L.map('map', {center: latlng, zoom: 0.7, layers: [tiles]});

		var markers = L.markerClusterGroup({
			showCoverageOnHover: false,
			maxClusterRadius: 80
			});

		for (var i = 0; i < addressPoints.length; i++) {
			var a = addressPoints[i];
			var title = a[0];
			var marker = L.marker(new L.LatLng(a[1], a[2]), { title: title });
			marker.bindPopup(title);
			markers.addLayer(marker);
		}
		map.addLayer(markers);
		map.zoomIn();

	</script>
    </body>
    </html>
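
The loop in the page reads each entry of `addressPoints` as `[title, latitude, longitude]`. If you want to feed the page with points from another source, a file of that shape can be produced as sketched below. The `var addressPoints = ...` layout is inferred from how the HTML uses the variable, not copied from getorg's output, and `write_address_points`, the file name, and the sample point are hypothetical.

```python
import json
import os
import tempfile

def write_address_points(points, path):
    """Write (title, lat, lon) tuples as a JavaScript variable named addressPoints."""
    payload = json.dumps([[title, lat, lon] for title, lat, lon in points])
    with open(path, "w", encoding="utf-8") as f:
        f.write("var addressPoints = " + payload + ";\n")

# Demo with one made-up point, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "my-locations.js")
    write_address_points([("example", 37.87, -122.27)], out_path)
    with open(out_path, encoding="utf-8") as f:
        content = f.read()

print(content)
```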



You can see the results in [this webpage](https://getorg.github.io/jupyter_map/), which I've screenshotted below:
![jupytermap](markercluster.png)