# Giga Tutorial
The examples below illustrate how the building blocks of the Giga modeling library can be used individually or in combination to perform a variety of modeling tasks for school connectivity assessment.

## Model Parameters
Let's first load in the parameters for a default project. We'll use the parameters defined in the Google Sheet [here](https://docs.google.com/spreadsheets/d/1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE/edit?usp=sharing). Note that the `docid` below is taken from the URL of that Google Sheet.

If you are using your own spreadsheet to define the parameters, its document ID can be read off the URL of the spreadsheet, which has the form:
`https://docs.google.com/spreadsheets/d/<documentId>/edit`
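
If you would rather not copy the ID by hand, it can be pulled out of the URL programmatically. The sketch below uses only the standard library. `extract_doc_id` is a hypothetical helper written for this tutorial and is not part of the Giga library.

```python
import re


def extract_doc_id(url: str) -> str:
    """Return the document ID from a Google Sheets (or Docs) URL."""
    # The ID is the path segment that follows "/d/".
    match = re.search(r"/d/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"No document ID found in URL: {url}")
    return match.group(1)


url = (
    "https://docs.google.com/spreadsheets/d/"
    "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE/edit?usp=sharing"
)
docid = extract_doc_id(url)
print(docid)
```

The resulting `docid` string is what the loading functions in this tutorial expect.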

In [1]:
import pandas as pd
from giga.utils.parse import fetch_all_params_from_gsheet


docid = "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE"
params = fetch_all_params_from_gsheet(docid)

Let's take a look at all the parameters we've loaded in.

In [2]:
from IPython.display import display
for k, v in params.items():
    print("\n", k, "parameters")
    display(v)


 project parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Student Teacher Ratio | 30.0 | |
| 1 | Teacher Classroom Ratio | 1.2 | |
| 2 | School Age Fraction | 0.37 | |
| 3 | School Enrollment Fraction | 0.76 | |
| 4 | People per Household | 6.0 | People |
| 5 | Skilled Labor Cost per Hour | 20.0 | USD |
| 6 | Regular Labor Cost per Hour | 4.0 | USD |
| 7 | Income per Household | 6700.0 | USD/Yr |
| 8 | Fraction of Income on Communications | 0.02 | |
| 9 | Subscription Conversion | 0.07 | |



 usage parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Fixed Bandwidth Rate | 10.0 | Mbps |
| 1 | EMIS Allowable Transfer Time | 4.0 | Hours |
| 2 | Allowable Website Loading Time | 20.0 | Seconds |
| 3 | Allowable Document Loading Time | 60.0 | Seconds |
| 4 | Allowable Completed Assignments Loading Time | 10.0 | Seconds |
| 5 | Peak Hours | 9.0 | Hours |
| 6 | Size of Website | 700.0 | KB |
| 7 | Size of Document | 200.0 | KB |
| 8 | Google Docs Bandwidth | 20.0 | Kbps |
| 9 | Internet Browsing Bandwidth | 1.0 | Mbps |



 assignment parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Student Prep Time | 4.0 | Hours |
| 1 | Teacher Research Time | 0.25 | Hours |
| 2 | Number of Daily Assignments Per Student | 2.0 | |
| 3 | Student Research Time | 1.0 | Hours |
| 4 | Student Assignments Time | 2.0 | Hours |
| 5 | Time to Grade One Assignment | 5.0 | Minutes |



 community parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Fraction of Community Using School Internet | 0.2 | |
| 1 | Session Length | 30.0 | Minutes |
| 2 | Weekly Sessions | 2.0 | |
| 3 | Community Access Hours | 8.0 | Hours/Day |
| 4 | Contention | 25.0 | People/Slot |



 lesson parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Weekly Planning Time | 5.0 | Hours |
| 1 | Fraction of Planning Time Browsing | 0.2 | |



 telemedicine parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | Annual Checkups | 1.0 | Per Year |
| 1 | Illness per Year | 2.0 | |
| 2 | Consults per Illness | 3.0 | |
| 3 | Consult time | 0.17 | Hours/Patient |
| 4 | Consult hours | 2.0 | Hours/Day |



 model parameters


| | Name | Value | Units |
|---|---|---|---|
| 0 | School Consolidation Radius | 0.01 | km |
| 1 | School Use Radius | 10.0 | km |
| 2 | Internet Use Radius | 1.0 | km |
| 3 | Revenue Over Cost Factor | 5.0 | |



 emis parameters


| | Name | Description | Size | Frequency | User | Type |
|---|---|---|---|---|---|---|
| 0 | Admin Enrollment | Size of enrollment data | 500 | 12 | School | EMIS |
| 1 | Admin Cohort | Clas by class cohorot data | 100 | 4 | Students | EMIS |
| 2 | Admin Behavioral | Disciplinary recods, etc | 100 | 12 | Students | EMIS |
| 3 | Admin Special Needs | | 100 | 12 | Students | EMIS |
| 4 | Admin Indicators | | 1000 | 4 | School | EMIS |
| 5 | Admin Financial Data | High-level finances | 10000 | 12 | School | EMIS |
| 6 | Financial Budget | | 10000 | 12 | School | EMIS |
| 7 | Financial School Fees | | 100 | 4 | Students | EMIS |
| 8 | Financial Supply and Inventory | | 1000 | 12 | School | EMIS |
| 9 | HR Salaries | | 10 | 4 | Employees | EMIS |



 portal parameters


| | Name | Description | Size | Usage Pattern | Time Period (Days) | User | Type |
|---|---|---|---|---|---|---|---|
| 0 | Voter Registration | | 1200 | Clustered | 20 | Individuals | Portal |
| 1 | ID Renewal | | 800 | Uniform | 365 | Individuals | Portal |
| 2 | Annual Taxes | | 4500 | Clustered | 5 | Households | Portal |
| 3 | Bill Payments | | 500 | Uniform | 2 | Households | Portal |
| 4 | Complaints and Reporting | | 500 | Uniform | 365 | Individuals | Portal |
| 5 | e-Petitions | | 500 | Uniform | 60 | Individuals | Portal |



 connectivity parameters


| | Type | Speed | Overnight Hardware Fixed | Overnight Labor Fixed | Setup Fees | Annual Hardware | Annual Labor Time | Annual Fees | Power | Overnight Hardware Variable | Overnight Labor Varaible Time | Subscription Conversion Rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Fiber | 1500.0 | 1200 | 5 | 250 | 0.2 | 10 | 2000 | 500 | 1000.0 | 10.0 | 0.2 |
| 1 | 2G | 0.0625 | 75 | 5 | 20 | 0.2 | 2 | 1200 | 10 | | | 0.1 |
| 2 | 3G | 2.0 | 150 | 5 | 75 | 0.2 | 2 | 1200 | 10 | | | 0.03 |
| 3 | 4G | 40.0 | 300 | 5 | 75 | 0.2 | 2 | 1200 | 10 | | | 0.005 |
| 4 | WISP | 400.0 | 1500 | 10 | 200 | 0.2 | 5 | 2000 | 200 | | | 0.2 |
| 5 | Sattelite | 150.0 | 4000 | 5 | 200 | 0.2 | 2 | 3000 | 200 | | | 0.2 |
| 6 | WAN | 400.0 | 1500 | 10 | 200 | 0.2 | 5 | 0 | 200 | | | 0.2 |



 energy parameters


| | Type | Overnight Hardware | Overnight Labor Regular | Overnight Labor Skilled | Annual Hardware | Annual Labor Regular | Annual Labor Skilled | Daylight |
|---|---|---|---|---|---|---|---|---|
| 0 | Battery | 461 | 0.75 | 1.25 | 27.6 | 0.125 | 0.075 | |
| 1 | Solar | 1820 | 4.0 | 4.0 | 165.0 | 2.0 | 2.0 | 12.0 |


We'll use a helper class to manage all of the Giga parameters for us. Let's load the parameters again through that class and save them to a JSON file for later use.

In [1]:
from giga.utils.parameters import GigaParameters


docid = "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE"
parameters = GigaParameters.from_google_sheet(docid)
parameters.to_json('default-parameters.json')

## Giga Model
Let's see how we can use the Giga model to generate insights about internet connectivity for schools.

In [3]:
import pandas as pd

from giga.models.giga import GigaNode
import logging

logging.getLogger().setLevel(logging.INFO)


data_file = 'school_data.xlsx'
tiff_file = 'rwanda.tiff'
data = pd.read_excel(data_file, engine='openpyxl')

node = GigaNode.from_giga_parameters('giga-node', parameters, tiff_file)

In [4]:
giga_data = node.run(data, {})

INFO       2021-11-08 00:19:54,096       : Starting giga node
INFO       2021-11-08 00:19:56,291       : School consolidation, inital input with 4234 entries, consolidated to 4187
INFO       2021-11-08 00:19:56,292       : Starting census node
INFO       2021-11-08 00:19:58,366       : Completed nearby school estimate for school use in 10.0 km radius
INFO       2021-11-08 00:20:01,083       : Completed nearby population estimate for school use in 10.0 km radius
INFO       2021-11-08 00:20:01,129       : Completed student estimate
INFO       2021-11-08 00:20:01,174       : Completed teacher estimate
INFO       2021-11-08 00:20:01,218       : Completed classroom estimate
INFO       2021-11-08 00:20:03,254       : Completed nearby school estimate for internet use in 1.0 km radius
INFO       2021-11-08 00:20:04,685       : Completed nearby population estimate for internet use in 1.0 km radius
INFO       2021-11-08 00:20:04,688       : Completed internet use estimate for neabry population

## Census Model
The census model computes a census estimate from information about the schools, the project of interest, and the population. It bundles a number of core Giga nodes to estimate the following for a given school:

* nearby population
* number of nearby households
* number of students at the school
* number of teachers at the school
* number of classrooms at the school

The census estimates can then be used directly to compute internet bandwidth estimates.

In [1]:
import pandas as pd
from giga.models.census import CensusNode


data_file = '../data/school/RW_connectivity_GIGA_GSMA_DistanceNodes.xlsx'
# population raster; alternative path: '../data/population/rwanda_2020.tif'
tiff_file = 'rwanda.tiff'

data = pd.read_excel(data_file, engine='openpyxl')

school_age_fraction = 0.354 # 35.4% of Rwandans are between ages 5 and 19
school_enrollment_fraction = 0.966 # 96.6% of eligible Rwandans enrolled in school
student_teacher_ratio = 30
teacher_classroom_ratio = 1.2
school_use_radius = 10.0 # km
census_radius = 1.0 # km
people_per_household = 6

node = CensusNode('census-node',
                  population_file=tiff_file,
                  school_age_fraction=school_age_fraction, 
                  school_enrollment_fraction=school_enrollment_fraction, 
                  student_teacher_ratio=student_teacher_ratio,
                  teacher_classroom_ratio=teacher_classroom_ratio,
                  people_per_household=people_per_household,
                  school_use_radius=school_use_radius,
                  census_radius=census_radius)

In [2]:
num_students, num_teachers, num_classrooms, nearby_population, nearby_households = node.run(data, {})

INFO       2021-11-09 18:33:47,376       : Starting census node
INFO       2021-11-09 18:33:49,831       : Completed nearby school estimate for school use in 10.0 km radius
INFO       2021-11-09 18:33:53,455       : Completed nearby population estimate for school use in 10.0 km radius
INFO       2021-11-09 18:33:53,503       : Completed student estimate
INFO       2021-11-09 18:33:53,555       : Completed teacher estimate
INFO       2021-11-09 18:33:53,607       : Completed classroom estimate
INFO       2021-11-09 18:33:55,948       : Completed nearby school estimate for internet use in 1.0 km radius
INFO       2021-11-09 18:33:57,369       : Completed nearby population estimate for internet use in 1.0 km radius
INFO       2021-11-09 18:33:57,372       : Completed internet use estimate for neabry population


In [3]:
data['num_students'] = num_students
data['num_teachers'] = num_teachers
data['num_classrooms'] = num_classrooms
data['nearby_population'] = nearby_population
data['nearby_households'] = nearby_households
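
To make the relationship between these columns concrete, here is a minimal sketch of how such estimates could be chained together. The formulas are assumptions made for illustration (in particular, that enrolled students are split evenly among nearby schools). They are not taken from the Giga source code, and `estimate_school_census` and `estimate_households` are hypothetical helper names.

```python
def estimate_households(nearby_population, people_per_household):
    """Assumed relation: households = population / people per household."""
    return nearby_population / people_per_household


def estimate_school_census(nearby_population, nearby_schools,
                           school_age_fraction, school_enrollment_fraction,
                           student_teacher_ratio, teacher_classroom_ratio):
    """Illustrative chain: population -> students -> teachers -> classrooms."""
    # Assumption: enrolled children are shared evenly among nearby schools.
    enrolled = nearby_population * school_age_fraction * school_enrollment_fraction
    num_students = enrolled / max(nearby_schools, 1)
    num_teachers = num_students / student_teacher_ratio
    num_classrooms = num_teachers / teacher_classroom_ratio
    return num_students, num_teachers, num_classrooms


# Example with the parameter values used in the cell above
# and a made-up population of 12,000 people shared by 4 schools.
students, teachers, classrooms = estimate_school_census(
    nearby_population=12000, nearby_schools=4,
    school_age_fraction=0.354, school_enrollment_fraction=0.966,
    student_teacher_ratio=30, teacher_classroom_ratio=1.2)
print(round(students), round(teachers), round(classrooms))
```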

## Bandwidth Nodes
Let's take a look at how the bandwidth nodes work. Using the census data above, we'll compute the primary and secondary bandwidth requirements for the project. We'll first load in the usage parameters for EMIS and Portal use.

In [5]:
from giga.utils.parse import fetch_all_params_from_gsheet


docid = "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE"
params = fetch_all_params_from_gsheet(docid)

emis_params = params['emis']
portal_params = params['portal']

Let's initialize the primary bandwidth node to estimate the bandwidth that's critical for every school.

In [6]:
from giga.core.primary_bandwidth import PrimaryBandwithNode

emis_allowable_transfer_time_hrs = 4
peak_hours = 9
internet_browsing_bandwidth = 1 # Mbps
allowable_website_loading_time = 20 # seconds
contention = 25 # number of people sharing a slot of bandwidth

node = PrimaryBandwithNode('bandwidth-node',
                           emis_usage=emis_params,
                           portal_usage=portal_params,
                           emis_allowable_transfer_time=emis_allowable_transfer_time_hrs,
                           peak_hours=peak_hours,
                           internet_browsing_bandwidth=internet_browsing_bandwidth,
                           allowable_website_loading_time=allowable_website_loading_time,
                           contention=contention)

In [7]:
bw = node.run(data, {})
data['bandwidth'] = bw
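
For intuition about the parameters passed to the node above, the sketch below works through the kind of arithmetic they imply: the bandwidth needed to move a payload within an allowable time, and the effect of contention. It is a simplified illustration under stated assumptions (decimal units, one bandwidth slot per `contention` users) and does not reproduce the node's internal calculation. Both function names are hypothetical.

```python
import math


def required_mbps(size_kb, allowable_seconds):
    """Bandwidth in Mbps needed to move size_kb kilobytes in the allowed time."""
    kilobits = size_kb * 8                      # 1 byte = 8 bits
    return kilobits / allowable_seconds / 1000  # kbps -> Mbps (decimal units)


def shared_demand_mbps(num_users, per_slot_mbps, contention):
    """Assumed contention model: one bandwidth slot per `contention` users."""
    slots = math.ceil(num_users / contention)
    return slots * per_slot_mbps


# A 700 KB website loaded within 20 seconds needs 0.28 Mbps.
print(required_mbps(size_kb=700, allowable_seconds=20))

# 60 simultaneous users at 1 Mbps per slot, 25 users per slot -> 3 slots.
print(shared_demand_mbps(num_users=60, per_slot_mbps=1.0, contention=25))
```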

Let's now look at the secondary bandwidth node. It estimates the bandwidth for non-critical internet usage that happens at schools.

In [8]:
from giga.core.secondary_bandwidth import SecondaryBandwithNode
from giga.utils.parameters import GigaParameters


parameters = GigaParameters.from_json('default-parameters.json')

node = SecondaryBandwithNode('non-critical-bandwidth', **vars(parameters))

In [15]:
bw_secondary = node.run(data, {})
data['bandwidth'] += bw_secondary

## Technology Node
The technology node determines which type of technology can support internet at a given school.

In [4]:
from giga.core.tech import TechnologyNode

name = 'tech-node'
speed_4g = 40 # Mbps
speed_3g = 2 # Mbps
speed_2g = 0.0625 # Mbps

node = TechnologyNode(name, 
                      speed_4g=speed_4g,
                      speed_3g=speed_3g,
                      speed_2g=speed_2g)

In [3]:
tech = node.run(data, {})
data['technology'] = tech
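
One simple way to think about technology assignment is to pick the slowest technology whose speed still covers the required bandwidth. The sketch below implements that rule with the speeds used above. This selection rule is an assumption for illustration. The actual `TechnologyNode` may take other factors, such as coverage, into account. `pick_technology` is a hypothetical helper.

```python
SPEEDS_MBPS = {"2G": 0.0625, "3G": 2.0, "4G": 40.0}


def pick_technology(required_mbps, speeds=SPEEDS_MBPS):
    """Return the slowest technology that meets the requirement, or None."""
    # Walk through the technologies from slowest to fastest.
    for name, speed in sorted(speeds.items(), key=lambda item: item[1]):
        if speed >= required_mbps:
            return name
    return None  # no listed technology is fast enough


for demand in (0.05, 1.5, 10.0, 100.0):
    print(demand, "Mbps ->", pick_technology(demand))
```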

## Cost Node
The cost node uses the technology assignments generated in the `TechnologyNode` to estimate the costs of deploying those technologies. To use the node, we'll first load the cost values for the technologies of interest.

In [1]:
from giga.utils.parse import fetch_all_params_from_gsheet


docid = "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE"
params = fetch_all_params_from_gsheet(docid)

conn_params = params['connectivity']
energy_params = params['energy']

In [3]:
from giga.core.cost import CostEstimateNode

name = 'cost-node'
labor_cost_skilled = 20 # usd/hr
labor_cost_regular = 4 # usd/hr

node = CostEstimateNode(name, 
                        connectivity_params=conn_params, 
                        energy_params=energy_params,
                        labor_cost_skilled=labor_cost_skilled,
                        labor_cost_regular=labor_cost_regular)

In [6]:
overnight, annual = node.run(data, {})
data['overnight_cost'] = overnight
data['annual_cost'] = annual
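
As a rough guide to how the connectivity parameters could combine into the two cost figures, here is a minimal sketch. It rests on assumptions about the parameter sheet that the tutorial does not confirm: that the labor columns are hours of skilled labor, and that `Annual Hardware` is a fraction of the overnight hardware cost. Energy costs are ignored, and both function names are hypothetical.

```python
def overnight_cost(hardware, labor_hours, setup_fees, labor_rate):
    """One-time cost: hardware + installation labor + setup fees (assumed)."""
    return hardware + labor_hours * labor_rate + setup_fees


def annual_cost(hardware, annual_hardware_fraction, annual_labor_hours,
                annual_fees, labor_rate):
    """Recurring cost: hardware upkeep + maintenance labor + fees (assumed)."""
    upkeep = annual_hardware_fraction * hardware
    return upkeep + annual_labor_hours * labor_rate + annual_fees


# 4G row of the connectivity parameters, skilled labor at 20 USD/hr.
print(overnight_cost(hardware=300, labor_hours=5, setup_fees=75, labor_rate=20))
print(annual_cost(hardware=300, annual_hardware_fraction=0.2,
                  annual_labor_hours=2, annual_fees=1200, labor_rate=20))
```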

## Revenue Node
This node estimates potential revenues that would come from deploying connectivity technology.

In [5]:
from giga.utils.parse import fetch_all_params_from_gsheet


docid = "1LsOLtcZG8FO9uF79H7Z_PdN6iHkuMyr5TXw3UllbahE"
params = fetch_all_params_from_gsheet(docid)

conn_params = params['connectivity']

In [7]:
from giga.core.revenue import RevenueNode

fraction_community_using_school_internet = 0.2
subscription_conversion_default = 0.2
income_per_household = 6700.0 # USD/Year
fraction_income_for_communications = 0.02

name = 'revenue-node'
node = RevenueNode(name,
                   connectivity_params=conn_params,
                   subscription_conversion_default=subscription_conversion_default,
                   fraction_community_using_school_internet=fraction_community_using_school_internet,
                   income_per_household=income_per_household,
                   fraction_income_for_communications=fraction_income_for_communications)

In [10]:
annual_revenue = node.run(data, {})
data['annual_revenue'] = annual_revenue
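
The revenue parameters suggest a calculation along the following lines: a share of nearby households subscribes, and each subscriber spends a fraction of household income on communications. The formula below is an assumed simplification for illustration, not the `RevenueNode` implementation, and `estimate_annual_revenue` is a hypothetical helper.

```python
def estimate_annual_revenue(nearby_households,
                            fraction_community_using_school_internet,
                            subscription_conversion,
                            income_per_household,
                            fraction_income_for_communications):
    """Assumed model: subscribers x annual communications spend per household."""
    subscribers = (nearby_households
                   * fraction_community_using_school_internet
                   * subscription_conversion)
    spend_per_household = income_per_household * fraction_income_for_communications
    return subscribers * spend_per_household


# Made-up community of 1,000 households with the parameter values used above.
revenue = estimate_annual_revenue(
    nearby_households=1000,
    fraction_community_using_school_internet=0.2,
    subscription_conversion=0.2,
    income_per_household=6700.0,
    fraction_income_for_communications=0.02)
print(round(revenue, 2))
```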

## Business Model Node
Let's take a look at the default business model assessment node. This node uses the estimated revenues and costs of deploying a connectivity technology to identify schools that may present a viable business case.

In [2]:
from giga.core.business import BusinessModelNode

revenue_over_cost_factor = 5.0 # multiple on revenue over the cost that makes business model worth exploring
name = 'business-node'

node = BusinessModelNode(name, revenue_over_cost_factor=revenue_over_cost_factor)

In [27]:
explore_business_model = node.run(data, {})
data['explore_business_model'] = explore_business_model
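
The `revenue_over_cost_factor` can be read as a threshold on the ratio of revenue to cost. A minimal sketch of such a check follows. It assumes the comparison is made between annual revenue and annual cost, which the tutorial does not state explicitly, and `worth_exploring` is a hypothetical helper.

```python
def worth_exploring(annual_revenue, annual_cost, revenue_over_cost_factor=5.0):
    """Flag a school when revenue is at least `factor` times the cost."""
    if annual_cost <= 0:
        # With no cost, any positive revenue makes the case worth a look.
        return annual_revenue > 0
    return annual_revenue / annual_cost >= revenue_over_cost_factor


print(worth_exploring(annual_revenue=6500, annual_cost=1300))  # ratio is 5.0
print(worth_exploring(annual_revenue=5360, annual_cost=1300))  # ratio is about 4.1
```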

## Consolidation Node
This node consolidates schools that are closer than a threshold distance to one another.

In [1]:
import pandas as pd
from giga.core.consolidate import ConsolidationNode

data_file = '../data/school/RW_connectivity_GIGA_GSMA_DistanceNodes.xlsx'
data = pd.read_excel(data_file, engine='openpyxl')

node_name = 'consolidation-node'
node = ConsolidationNode(node_name)

In [3]:
consolidated = node.run(data, {'consolidation_radius': 0.01}) # in km so 10m radius for schools

In [9]:
print(f"Length of original dataset: {len(data)}, length after consolidation: {len(data[consolidated])}")

Length of original dataset: 4234, length after consolidation: 4187
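
To illustrate the idea behind consolidation, the sketch below builds a boolean mask that drops any point lying within a threshold distance of a point already kept, using the haversine great-circle distance. It is a simple greedy pass written for illustration. The approach inside `ConsolidationNode` may differ, and the coordinates and function names here are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def consolidation_mask(points, radius_km):
    """Keep a point unless it lies within radius_km of a point already kept."""
    kept, mask = [], []
    for lat, lon in points:
        is_duplicate = any(
            haversine_km(lat, lon, klat, klon) <= radius_km for klat, klon in kept)
        mask.append(not is_duplicate)
        if not is_duplicate:
            kept.append((lat, lon))
    return mask


# The second point is roughly 5.6 m from the first; the third is about 1.1 km away.
points = [(-1.95000, 30.06), (-1.95005, 30.06), (-1.96000, 30.06)]
print(consolidation_mask(points, radius_km=0.01))
```

A mask of this shape can be used to index a DataFrame in the same way as `data[consolidated]` above.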


## Nearest Neighbor Node
An example of using the nearest neighbor node is shown below.

In [3]:
import pandas as pd
from giga.core.knn import NearestNeighborsNode

data_file = '../data/school/RW_connectivity_GIGA_GSMA_DistanceNodes.xlsx'
data = pd.read_excel(data_file, engine='openpyxl')

pt_input = ['Lat', 'Lon']
pts = data[pt_input].to_numpy()

node_name = 'nearest-schools'
node = NearestNeighborsNode(node_name, pts, pt_input=pt_input)

In [4]:
n_nearest_schools = node.run(data, {})
data['nearby_schools'] = n_nearest_schools
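
Conceptually, a nearby-schools count answers the question "how many other schools lie within a given radius of this one?". The brute-force sketch below shows that idea on a local flat projection of the coordinates. It is an illustration only: the radius, the choice to exclude the school itself, and the helper names are assumptions, and `NearestNeighborsNode` may use a different method.

```python
import math

EARTH_RADIUS_KM = 6371.0


def to_local_km(lat, lon, lat0):
    """Equirectangular projection to kilometres; adequate for small regions."""
    x = EARTH_RADIUS_KM * math.radians(lon) * math.cos(math.radians(lat0))
    y = EARTH_RADIUS_KM * math.radians(lat)
    return x, y


def count_neighbors(points, radius_km):
    """For each (lat, lon) point, count the other points within radius_km."""
    lat0 = sum(lat for lat, _ in points) / len(points)
    projected = [to_local_km(lat, lon, lat0) for lat, lon in points]
    counts = []
    for i, p in enumerate(projected):
        counts.append(sum(1 for j, q in enumerate(projected)
                          if j != i and math.dist(p, q) <= radius_km))
    return counts


# Two schools about 0.56 km apart and a third roughly 28 km away.
schools = [(-1.950, 30.06), (-1.955, 30.06), (-2.200, 30.06)]
print(count_neighbors(schools, radius_km=1.0))
```

The pairwise loop is quadratic in the number of schools, so a spatial index would be preferable for datasets much larger than a few thousand rows.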

## Population Model
The examples below show how a census model can be used to estimate the total population in a region, and how that estimate can be used to infer the number of students, teachers, and classrooms at a location of interest.

In [5]:
import pandas as pd
from giga.core.population import PopulationNode

node_name = 'population-census-node'
tiff_file = '../data/population/rwanda_2020.tif'
node = PopulationNode.from_tiff(node_name, tiff_file, lon_input="Lon", lat_input="Lat")

In [6]:
# using data model from above
nearby = node.run(data, {})
data['nearby_population'] = nearby
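
The underlying idea of a raster-based population estimate is to add up the population of every grid cell whose center falls within a radius of the school. The toy sketch below does this on a small in-memory grid with a fixed cell size. It deliberately skips georeferencing and file reading, so it is not how `PopulationNode` handles a real GeoTIFF. The grid values and function name are made up.

```python
import math


def population_within(grid, cell_km, center_row, center_col, radius_km):
    """Sum the cells whose centers lie within radius_km of the center cell."""
    total = 0.0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            distance_km = math.hypot(r - center_row, c - center_col) * cell_km
            if distance_km <= radius_km:
                total += value
    return total


# 3 x 3 grid of 1 km cells with 10 people each, school in the middle cell.
grid = [[10, 10, 10],
        [10, 10, 10],
        [10, 10, 10]]
print(population_within(grid, cell_km=1.0, center_row=1, center_col=1, radius_km=1.0))
```

With a 1 km radius only the center cell and its four direct neighbors qualify, because the diagonal cells are about 1.41 km away.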

## School Census Models
We can use the models above to generate data inputs for the school census models that compute the number of students, teachers, and classrooms in a given school. Let's first look at estimating the number of students in a school. We'll need two project parameters to do so: the school age fraction and the school enrollment fraction.

In [6]:
from giga.core.student import StudentEstimateNode

node_name = 'student-estimate-node'
school_age_fraction = 0.354 # 35.4% of Rwandans are between ages 5 and 19
school_enrollment_fraction = 0.966 # 96.6% of eligible Rwandans enrolled in school
population_input = 'nearby_population'
school_input = 'nearby_schools'
node = StudentEstimateNode(node_name,
                           school_age_fraction=school_age_fraction,
                           school_enrollment_fraction=school_enrollment_fraction,
                           population_input=population_input,
                           school_input=school_input)

In [9]:
num_students = node.run(data, {})
data['num_students'] = num_students

Now let's estimate the number of teachers in the school using the student estimate.

In [4]:
from giga.core.teacher import TeacherEstimateNode

node_name = 'teacher-estimate-node'
student_teacher_ratio = 30
student_input = 'num_students'
node = TeacherEstimateNode(node_name,
                           student_teacher_ratio=student_teacher_ratio,
                           student_input=student_input)

In [11]:
num_teachers = node.run(data, {})
data['num_teachers'] = num_teachers

Lastly, let's estimate the number of classrooms.

In [11]:
from giga.core.classroom import ClassroomEstimateNode

node_name = 'classroom-estimate-node'
teacher_classroom_ratio = 1.2
teacher_input = 'num_teachers'
node = ClassroomEstimateNode(node_name, teacher_classroom_ratio=teacher_classroom_ratio, teacher_input=teacher_input)

In [13]:
num_classrooms = node.run(data, {})
data['num_classrooms'] = num_classrooms