# Example weightGIS

Let's work through an example of how this whole process may be undertaken. Take a fictional island nation with three
core regions, one of which expands its borders sometime between 1931 and 1951. For this example we also have the
underlying city administrative regions from 1921, along with the population of each region, so we can do both area and
population weighting.

In [None]:
from IPython.display import Image
Image(url= "https://github.com/sbaker-dev/weightGIS/Example/Images/ExampleChanges.png")

## Construct base weights

If you want to follow along, all the example data is in a folder on the [github page][repo] under ExampleData. This
includes the results as well as the raw data. To start with, we need to see when changes occur. To do this we compare a
set of shapefiles and check whether each polygon stays the same in the next iteration of that place in time. You need to
provide a working directory containing a folder called "Shapefiles" that holds all the shapefiles you want to compare;
alternatively, tell ConstructWeights the name of the folder by setting the keyword argument shape_file_folder_name.

If you want to do population sub-weighting, you also need to put a population shapefile in the project directory, not
the shapefile directory, and set weight_index to the zero-based index of the column that holds the population
information. We will do this in this example. In this case most of the indexes are left at their default values, but
remember to set them if your attributes are not in the default positions, just as weight_index has been set here for our
subunit attribute column.
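
Before running anything, it can help to confirm the folder layout described above. The following is a minimal sketch using only the standard library; `check_project_layout` is a hypothetical helper written for this walkthrough and is not part of weightGIS.

```python
from pathlib import Path


def check_project_layout(project_directory, population_shape,
                         shape_file_folder_name="Shapefiles"):
    """Return a list of problems with the folder layout (hypothetical helper)."""
    root = Path(project_directory)
    problems = []
    # The shapefiles to compare live in a sub-folder of the project directory.
    if not (root / shape_file_folder_name).is_dir():
        problems.append(f"Missing folder: {shape_file_folder_name}")
    # The population shapefile sits in the project directory itself.
    if not (root / population_shape).is_file():
        problems.append(f"Missing population shapefile: {population_shape}")
    return problems


for problem in check_project_layout(r"ASSIGN A PATH HERE", "1921.shp"):
    print(problem)
```

An empty list means both the shapefile folder and the population shapefile were found where this example expects them.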

[repo]: https://github.com/sbaker-dev/weightGIS/Example/ExampleData

In [None]:
from weightGIS import ConstructWeights

project_directory = r"ASSIGN A PATH HERE"
base_shape = "1951.shp"
population_shape = "1921.shp"

ConstructWeights(project_directory, base_shape, population_shape, weight_index=2).construct_base_weights()

As you can see, whilst Ecanlor does gain a large amount of Danlhigh, the land it gains is mostly open mountains and
grassland, so the population that has been re-assigned is far smaller than the area transferred would suggest, as few
people lived in these rural areas. This example has been constructed to be an extreme case, but it shows that when the
areas transferred are large but do not hold an equivalently large population, area weights may be a poor choice. In
general, the larger the geographical generalisation you use, the more dangerous area weights become.
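
To make the contrast concrete, here is a minimal sketch with made-up numbers. None of the figures come from the example data, and `percent_weight` is an illustrative function rather than anything weightGIS provides. It only shows how far an area share and a population share can diverge for the same transfer.

```python
def percent_weight(transferred, total):
    """Return the share of `total` that was transferred, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * transferred / total


# Hypothetical figures for a region that loses land to a neighbour.
region_area_km2 = 1000.0
transferred_area_km2 = 400.0      # large, mostly empty land
region_population = 50000
transferred_population = 1000     # few people lived there

area_weight = percent_weight(transferred_area_km2, region_area_km2)
population_weight = percent_weight(transferred_population, region_population)

print(f"Area weight: {area_weight:.1f}%")              # Area weight: 40.0%
print(f"Population weight: {population_weight:.1f}%")  # Population weight: 2.0%
```

With these invented figures, an area weight would re-assign 40% of the region's data while only 2% of its people actually moved.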

## Determine Changes

We now know that a change occurs between Ecanlor and Danlhigh, and we have the weight of that change, but we don't know
exactly when it occurs; we simply know it occurs within this period. To assign a date to the change we first need to
write out all the changes, which can be done via the write_out_changes method. You just need to provide the path to the
file you just created, and we can save the output to our working directory. Given that in this case our units did not
have a class, we need to set name_class to False.

In [None]:
from weightGIS.AssignWeights import AssignWeights

AssignWeights("BaseWeights_0.txt", project_directory, "ChangeLog").write_out_changes()


As we can see, both Ecanlor and Danlhigh experience a change, so we now need to find out when this occurs. This is an
important time to bring up another limitation of *observable changes*. Let's say we dig through the archives and find
that between 1931 and 1951 there were actually two changes, even though we were only expecting one. When we use
shapefiles we only see the cumulative effect of all the changes, and can only act upon these observed changes. Clearly,
if you can find mapping information on these individual changes then it's possible to correct this by drawing new
shapefiles, but this is unlikely to be feasible for larger projects on time budgets alone, let alone because of
practical or data limitations.

## Determine when the changes occur

So, let's say the first change happened in 1938, and then we have another change in 1939. The change that produces the
new shape we observe occurs in 1939, so that is the date that will be assigned. Whether you want to go out of your way
to record the changes that will not be used is up to you, although it can be important for transparency so it is
recommended. This means that you will produce a file that looks like the following, which can be seen in
Weight_Dates.csv:

| GID         | Place Name | Changes1    | Changes2    |
|:------------|:-----      |:-----       |:-----       |
| 1           | Ecanlor    | 01/04/1938  | 01/04/1939  |
| 2           | Nirghol    | -           | -           |
| 3           | Danlhigh   | 01/04/1938  | 01/04/1939  |
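
As a rough illustration of the rule that the last recorded change is the one that matters, the sketch below parses csv text laid out like the table above and keeps the latest date per place. It assumes day/month/year dates and `-` for "no change", and it is not how weightGIS itself reads the file; `last_change_dates` is a hypothetical helper.

```python
import csv
import io
from datetime import datetime

# Assumed csv contents, mirroring the table above.
weight_dates_csv = """GID,Place Name,Changes1,Changes2
1,Ecanlor,01/04/1938,01/04/1939
2,Nirghol,-,-
3,Danlhigh,01/04/1938,01/04/1939
"""


def last_change_dates(csv_text):
    """Map 'GID__Place Name' to its latest change as YYYYMMDD, or None."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        dates = []
        for column, value in row.items():
            value = value.strip()
            # Only the change columns hold dates; '-' or blank means no change.
            if column.startswith("Changes") and value not in ("", "-"):
                dates.append(datetime.strptime(value, "%d/%m/%Y"))
        key = f"{row['GID']}__{row['Place Name']}"
        latest[key] = max(dates).strftime("%Y%m%d") if dates else None
    return latest


print(last_change_dates(weight_dates_csv))
```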

## Construct weighted database

Next we want to take these weights and construct a database that holds the weights relative to the dates on which places
change. First we need to load the weights that were generated by ConstructWeights. By default the system will look for a
file called Weight_Dates.csv, but if you call it something else you will need to update the dates_name keyword argument
within AssignWeights. It may be the case that you only observe census dates in a general year format, while the changes
you have are more specific, in year-month-day terms. If so, you need to adjust the year format by passing a month and
day to the assign_weights_dates method so that it can look at changes occurring between them. Finally, provide a write
directory and name, and then you're finished!
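
To show what attaching a month and day to a census year amounts to, here is a small sketch. It assumes the year and the four-digit month-day string are simply joined into a YYYYMMDD stamp, which matches the keys in this example's output; `census_date` is a hypothetical helper, not a weightGIS function.

```python
from datetime import datetime


def census_date(year, month_day="0401"):
    """Join a census year and an MMDD string into a validated YYYYMMDD stamp."""
    if len(month_day) != 4 or not month_day.isdigit():
        raise ValueError("month_day must be four digits in MMDD order")
    stamp = f"{int(year):04d}{month_day}"
    # strptime raises ValueError for impossible dates such as 30 February.
    datetime.strptime(stamp, "%Y%m%d")
    return stamp


print(census_date(1931))          # 19310401
print(census_date(1951, "0401"))  # 19510401
```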

In [None]:
from weightGIS.AssignWeights import AssignWeights

AssignWeights("BaseWeights_0.txt", project_directory, "1951_weights_by_dates").assign_weights_dates("0401")

Image(url= "https://github.com/sbaker-dev/weightGIS/Example/Images/jsonView.png")


This will write out a json database for each ID and place, showing when the changes occur, starting from the first
census year provided. As we can see in the json data below, each place in our reference shape now has dates assigned to
each change, along with the places involved in that change in the form of the change place ID, the change place name,
and the weight that was specified.

```json
{
    "1__Ecanlor": {
        "19310401": {
            "1__Ecanlor": 100.0,
            "3__Danlhigh": 1.8336986193489935
        },
        "19390401": {
            "1__Ecanlor": 100.0
        }
    },
    "2__Nirghol": {
        "19310401": {
            "2__Nirghol": 100.0
        }
    },
    "3__Danlhigh": {
        "19310401": {
            "3__Danlhigh": 98.16443083327889
        },
        "19390401": {
            "3__Danlhigh": 100.0
        }
    }
}

```
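
One plausible way to use a structure like this is sketched below. It assumes that each date key marks the start of the period in which its weights apply, and that weights are percentages of each source place's value. Both are assumptions made for illustration, as are the observed values and the helper names; this is not the library's own weighting code.

```python
# Subset of the weights shown above.
weights = {
    "1__Ecanlor": {
        "19310401": {"1__Ecanlor": 100.0, "3__Danlhigh": 1.8336986193489935},
        "19390401": {"1__Ecanlor": 100.0},
    },
}


def weights_for_date(place_weights, date):
    """Pick the weight set whose start date is the latest one not after `date`."""
    # YYYYMMDD strings sort chronologically, so string comparison is enough.
    applicable = [start for start in place_weights if start <= date]
    if not applicable:
        raise KeyError(f"No weights start on or before {date}")
    return place_weights[max(applicable)]


def weighted_value(all_weights, place, date, observed):
    """Sum each source place's observed value scaled by its percentage weight."""
    total = 0.0
    for source, percent in weights_for_date(all_weights[place], date).items():
        total += observed[source] * percent / 100.0
    return total


# Hypothetical observations in the boundaries of the time.
observed = {"1__Ecanlor": 1000.0, "3__Danlhigh": 2000.0}

print(weighted_value(weights, "1__Ecanlor", "19350401", observed))
print(weighted_value(weights, "1__Ecanlor", "19400401", observed))
```

Under these assumptions, the 1935 value for Ecanlor picks up a small share of Danlhigh's value, while the 1940 value does not, because the boundary change has already happened.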

Now that we have these weights, we can weight time-series data from these regions using just one of the shapefiles. In
this case we are going to weight some example data in two ways. If you have a very large, complex data set or are trying
to merge multiple data sets, then it is recommended that you create a json database that represents your data. If you
are unfamiliar with JSON and want to use a csv, you can do so as long as you don't mix geo levels within the same
document. Places need to be in the first column, with dates running across the top. Dates without data should be left
blank. If you are using a json data structure, it needs to take this format:
 
```json
{
    "Geo-level1": {
        "PlaceName": {
            "AttributeA": {
                "Dates": [],
                "Values": []
            },
            "AttributeB": {
                "Dates": [],
                "Values": []
            },
            "AttributeC": {
                "Dates": [],
                "Values": []
            }
        }
    },
    "Geo-level2": {
        "PlaceName": {
            "AttributeA": {
                "Dates": [],
                "Values": []
            }
        },
        "PlaceName2": {
            "AttributeA": {
                "Dates": [],
                "Values": []
            }
        }
    }
}

```
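
As an illustration of how the csv layout described above maps onto this structure, the sketch below converts a small, invented csv into the nested format for a single geo level and attribute. The geo-level name "District", the attribute name "Population", the values, and the choice to drop blank cells are all assumptions made for the example.

```python
import csv
import io
import json

# Invented csv: places in the first column, dates across the top, blanks for gaps.
csv_text = """Place,1931,1941,1951
1__Ecanlor,1000,,1200
2__Nirghol,500,550,600
"""


def csv_to_database(text, geo_level, attribute):
    """Build {geo_level: {place: {attribute: {"Dates": [...], "Values": [...]}}}}."""
    reader = csv.reader(io.StringIO(text))
    dates = next(reader)[1:]  # header row minus the place column
    places = {}
    for row in reader:
        if not row:
            continue
        place, cells = row[0], row[1:]
        # Keep only the dates that actually have a value.
        kept = [(d, float(c)) for d, c in zip(dates, cells) if c.strip()]
        places[place] = {
            attribute: {
                "Dates": [d for d, _ in kept],
                "Values": [v for _, v in kept],
            }
        }
    return {geo_level: places}


database = csv_to_database(csv_text, "District", "Population")
print(json.dumps(database, indent=4))
```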

You need to structure your data in a manner that is geo-level specific. For example, if you have state-level and
district-level data, each needs its own group. Even if you only have one level, you **must** group the data by that
level. From there you need to assign unique places that have unique attributes, otherwise they will override each other
due to how json works. Place names need to be in the ID__NAME format specifically. You can have additional information
in the name, and although an ID isn't very human readable, which goes against the principle of json, you must make sure
that the ID is the first element, as it will be extracted by .split("__")[0] with the rest of the information discarded.
If you want the place name to be assigned, make sure to have an attribute within the PlaceName that is assigned the name.
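
The two points above, ID extraction and keys overriding each other, can be seen in a few lines of plain Python. The `place_id` helper is hypothetical and simply wraps the `.split("__")[0]` call mentioned above; the duplicate-key behaviour shown is that of Python's standard `json` module.

```python
import json


def place_id(key):
    """Return the ID portion of an 'ID__NAME' key."""
    if "__" not in key:
        raise ValueError(f"Expected ID__NAME, got {key!r}")
    return key.split("__")[0]


print(place_id("1__Ecanlor"))          # 1
print(place_id("3__Danlhigh__extra"))  # 3

# Duplicate keys do not raise an error: the last one silently wins.
duplicated = '{"1__Ecanlor": {"A": 1}, "1__Ecanlor": {"A": 2}}'
print(json.loads(duplicated))          # {'1__Ecanlor': {'A': 2}}
```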


# Adjacent Polygons

It is also possible to extract adjacent polygon relations via AdjacentRelations. This looks for common points between
the shapes within your shapefile and then creates a link json file based on the record index you want.
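
The idea of linking shapes through common points can be sketched without any GIS library. The example below is not the AdjacentRelations implementation; it uses invented square polygons given as lists of coordinate tuples, and a hypothetical `adjacent_relations` function that links any two shapes sharing at least one vertex.

```python
import json
from itertools import combinations

# Invented polygons: A and B share an edge, C stands apart.
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(5, 5), (6, 5), (6, 6), (5, 6)],
}


def adjacent_relations(shapes):
    """Link every pair of shapes that have at least one vertex in common."""
    links = {name: [] for name in shapes}
    for (name_a, points_a), (name_b, points_b) in combinations(shapes.items(), 2):
        if set(points_a) & set(points_b):
            links[name_a].append(name_b)
            links[name_b].append(name_a)
    return links


print(json.dumps(adjacent_relations(polygons), indent=4))
```

Comparing exact vertices like this only works when neighbouring shapes share identical coordinates, which is an assumption of the sketch.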