# Define Individual Segment Context &ndash; Household Use Case #

## Overview ##

Explore the FEC data by specifying SQL predicates that identify **Individual Segments**, which are static lists of Individual (`indiv` table) records.  An Individual Segment does not distinguish between member records that represent the same real-world person and those that represent different people; the meaning of a segment's collective membership is up to whoever defines it.  Note that an Individual Segment context may include one *or more* segments (e.g. selected by name or ID).  As with Individual and Household contexts, Donor identities are not discernible within queries using this context type.

This approach will create the following query contexts:

**Principal Context View**

* `ctx_iseg`

**Dependent Context Views**

* `ctx_iseg_memb`
* `ctx_indiv`
* `ctx_indiv_contrib`

## Notebook Setup ##

* Configure database connect information and options
* Clear potentially interfering context (PostgreSQL's `create or replace view` cannot change the names, types, or order of a view's existing columns)
* Set styling for notebook

In [1]:
sqlconnect = "postgresql+psycopg2://crash@localhost/fecdb"

%load_ext sql
%config SqlMagic.autopandas=True
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'
%sql $sqlconnect

'Connected: crash@fecdb'

Note that we drop *all* context views so we won't have any inconsistencies after this notebook is run.  After defining `ctx_iseg` below, we will define all dependent views (see Overview, above), and leave any higher-order or orthogonal views undefined.

In [2]:
%sql drop view if exists ctx_dseg_memb     cascade
%sql drop view if exists ctx_dseg          cascade
%sql drop view if exists ctx_donor_contrib cascade
%sql drop view if exists ctx_donor         cascade
%sql drop view if exists ctx_household     cascade
%sql drop view if exists ctx_iseg_memb     cascade
%sql drop view if exists ctx_iseg          cascade
%sql drop view if exists ctx_indiv_contrib cascade
%sql drop view if exists ctx_indiv         cascade

 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
 * postgresql+psycopg2://crash@localhost/fecdb
Done.
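
The nine drop statements above follow one pattern, so they could also be generated from a list of view names. The following is only a sketch: it builds the statement strings and prints them, and leaves executing them (e.g. through the `%sql` magic) to the notebook. The variable names `ctx_views` and `drop_stmts` are illustrative, not part of the original notebook.

```python
# View names in the same order as the drop statements above
ctx_views = [
    "ctx_dseg_memb", "ctx_dseg",
    "ctx_donor_contrib", "ctx_donor",
    "ctx_household",
    "ctx_iseg_memb", "ctx_iseg",
    "ctx_indiv_contrib", "ctx_indiv",
]

# `if exists` tolerates views already removed by an earlier `cascade`
drop_stmts = [f"drop view if exists {view} cascade" for view in ctx_views]

for stmt in drop_stmts:
    print(stmt)
```

Because every statement uses both `if exists` and `cascade`, each one succeeds whether or not an earlier statement has already removed the view.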


In [3]:
%%html
<style>
  tr, th, td {
    text-align: left !important;
  }
</style>

## Create Individual Segment (for Household) ##

For this use case, we are defining an Individual Segment representing a household we have previously identified.

Note that we append an "epoch" (Unix time) suffix to the segment name for uniqueness, so we don't have to clean up between runs.

In [4]:
%%sql result <<
select create_indiv_seg(array_agg(i.id), concat('Sandell 9402x Household - ', round(extract(epoch from now()))))
  from indiv i
 where i.name like 'SANDELL, %'
   and i.zip_code ~ '9402[58]'

 * postgresql+psycopg2://crash@localhost/fecdb
1 rows affected.
Returning data to local variable result


In [5]:
# The new segment ID is the single value in the one-row result
indiv_seg_id = int(result.iloc[0, 0])

8
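
To make the selection predicate in the segment-creation query concrete, here is a rough Python approximation of it. This is a sketch under assumptions: `str.startswith` stands in for `like 'SANDELL, %'`, `re.search` stands in for PostgreSQL's `~` operator (adequate for this simple pattern), and the helper `matches` and the last two sample rows are made up for illustration. The first three rows use values that appear in the membership output later in this notebook.

```python
import re
import time

rows = [
    ("SANDELL, JENNIFER", "94025"),
    ("SANDELL, JENNIFER", "940287608"),
    ("SANDELL, SCOTT D", "94025"),
    ("SANDELL, PAT", "94301"),    # hypothetical: zip code outside 9402[58]
    ("SANDELLO, PAT", "94025"),   # hypothetical: name lacks the 'SANDELL, ' prefix
]

def matches(name: str, zip_code: str) -> bool:
    """Approximate the WHERE clause: prefix match on name, regex match on zip."""
    name_ok = name.startswith("SANDELL, ")
    # The pattern is unanchored, so it matches anywhere in the zip code string
    zip_ok = re.search(r"9402[58]", zip_code) is not None
    return name_ok and zip_ok

selected = [row for row in rows if matches(*row)]
print(selected)

# Segment name with a Unix-time suffix, mirroring round(extract(epoch from now()))
seg_name = f"Sandell 9402x Household - {round(time.time())}"
print(seg_name)
```

Since the regular expression is not anchored, nine-digit values such as `940287608` are selected along with five-digit ones, which is consistent with the membership listing shown later.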

## Create Principal View (`ctx_iseg`) ##

The context will represent only the single segment just defined.  Queries written against this context will also work for multi-segment contexts.

In [6]:
%%sql
create or replace view ctx_iseg as
select id,
       name,
       description
  from indiv_seg isg
 where isg.id = :indiv_seg_id

 * postgresql+psycopg2://crash@localhost/fecdb
Done.


In [7]:
%%sql
select *
  from ctx_iseg

 * postgresql+psycopg2://crash@localhost/fecdb
1 rows affected.


Unnamed: 0,id,name,description
0,8,Sandell 9402x Household - 1568437064,
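
For comparison, a multi-segment version of `ctx_iseg` might select on a list of IDs rather than a single one. The sketch below is not run against `fecdb`: it uses an in-memory SQLite database as a stand-in for PostgreSQL, a simplified `indiv_seg` table with made-up rows, and hypothetical segment IDs in `seg_ids`. SQLite views cannot take bind parameters, so the ID list is interpolated after coercing each value to `int`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Simplified stand-in for the real indiv_seg table (toy rows)
    create table indiv_seg (id integer primary key, name text, description text);
    insert into indiv_seg values
        (1, 'Household A', null),
        (2, 'Household B', null),
        (3, 'Unrelated',   null);
""")

seg_ids = [1, 2]  # hypothetical segment IDs to include in the context
id_list = ", ".join(str(int(i)) for i in seg_ids)  # int() guards the interpolation

con.execute(f"""
    create view ctx_iseg as
    select id,
           name,
           description
      from indiv_seg isg
     where isg.id in ({id_list})
""")

ctx_rows = con.execute("select id, name from ctx_iseg order by id").fetchall()
print(ctx_rows)  # [(1, 'Household A'), (2, 'Household B')]
```

The view keeps the same three columns as the single-segment definition, so it should be replaceable in place without first dropping dependent views.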


## Create Dependent Views ##

### Create `ctx_iseg_memb` ###

In [8]:
%%sql
create or replace view ctx_iseg_memb as
select ism.*
  from ctx_iseg isx
  join indiv_seg_memb ism on ism.indiv_seg_id = isx.id

 * postgresql+psycopg2://crash@localhost/fecdb
Done.


In [9]:
%%sql
select isg.name as iseg_name,
       i.name   as indiv_name,
       i.city,
       i.state,
       i.zip_code,
       i.elect_cycles
  from ctx_iseg_memb ismx
  join indiv_seg isg on isg.id = ismx.indiv_seg_id
  join indiv i on i.id = ismx.indiv_id

 * postgresql+psycopg2://crash@localhost/fecdb
27 rows affected.


Unnamed: 0,iseg_name,indiv_name,city,state,zip_code,elect_cycles
0,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER A MS.",MENLO PARK,CA,94025,[2004]
1,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER",MENLO PARK,CA,94025,"[2004, 2006, 2008, 2010]"
2,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER MS.",MENLO PARK,CA,94025,[2004]
3,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER A",MENLO PARK,CA,94025,"[2006, 2008]"
4,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER AYER",MENLO PARK,CA,94025,"[2004, 2010]"
5,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER",MENLO PARK,CA,940250,[2004]
6,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER",PORTOLA VALLEY,CA,940287608,"[2016, 2018, 2020]"
7,Sandell 9402x Household - 1568437064,"SANDELL, JENNIFER",PORTOLA VALLEY,CA,94028,[2018]
8,Sandell 9402x Household - 1568437064,"SANDELL, SCOTT D",MENLO PARK,CA,94025,"[2004, 2006, 2008, 2010]"
9,Sandell 9402x Household - 1568437064,"SANDELL, SCOTT MRS.",MENLO PARK,CA,94025,[2004]


### Create `ctx_indiv` ###

In [10]:
%%sql
create or replace view ctx_indiv as
select i.*
  from ctx_iseg_memb ismx
  join indiv i on i.id = ismx.indiv_id

 * postgresql+psycopg2://crash@localhost/fecdb
Done.


In [11]:
%%sql
select id,
       name,
       city,
       state,
       zip_code,
       elect_cycles
  from ctx_indiv

 * postgresql+psycopg2://crash@localhost/fecdb
27 rows affected.


Unnamed: 0,id,name,city,state,zip_code,elect_cycles
0,10527369,"SANDELL, JENNIFER A MS.",MENLO PARK,CA,94025,[2004]
1,10527363,"SANDELL, JENNIFER",MENLO PARK,CA,94025,"[2004, 2006, 2008, 2010]"
2,10527371,"SANDELL, JENNIFER MS.",MENLO PARK,CA,94025,[2004]
3,10527368,"SANDELL, JENNIFER A",MENLO PARK,CA,94025,"[2006, 2008]"
4,10527370,"SANDELL, JENNIFER AYER",MENLO PARK,CA,94025,"[2004, 2010]"
5,10527364,"SANDELL, JENNIFER",MENLO PARK,CA,940250,[2004]
6,10527366,"SANDELL, JENNIFER",PORTOLA VALLEY,CA,940287608,"[2016, 2018, 2020]"
7,10527365,"SANDELL, JENNIFER",PORTOLA VALLEY,CA,94028,[2018]
8,10527433,"SANDELL, SCOTT D",MENLO PARK,CA,94025,"[2004, 2006, 2008, 2010]"
9,10527447,"SANDELL, SCOTT MRS.",MENLO PARK,CA,94025,[2004]


### Create `ctx_indiv_contrib` ###

In [12]:
%%sql
create or replace view ctx_indiv_contrib as
select ic.*
  from ctx_indiv ix
  join indiv_contrib ic on ic.indiv_id = ix.id

 * postgresql+psycopg2://crash@localhost/fecdb
Done.


In [13]:
%%sql
select count(*)             as contribs,
       sum(transaction_amt) as total_amt,
       array_agg(distinct elect_cycle) as elect_cycles
  from ctx_indiv_contrib

 * postgresql+psycopg2://crash@localhost/fecdb
1 rows affected.


Unnamed: 0,contribs,total_amt,elect_cycles
0,101,264450.0,"[2000, 2002, 2004, 2006, 2008, 2010, 2012, 201..."
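
One thing that may be worth checking before reusing these views with a multi-segment context: as the joins are written, an individual who belongs to more than one segment in the context appears once per membership, and so do that individual's contributions. The sketch below illustrates this with an in-memory SQLite database standing in for PostgreSQL. All table contents are toy values, and the column lists are simplified assumptions rather than the real `fecdb` schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Toy stand-ins for the real tables; column lists are assumptions
    create table indiv_seg      (id integer primary key, name text, description text);
    create table indiv_seg_memb (indiv_seg_id integer, indiv_id integer);
    create table indiv          (id integer primary key, name text);
    create table indiv_contrib  (indiv_id integer, transaction_amt real, elect_cycle integer);

    insert into indiv_seg values (1, 'Segment A', null), (2, 'Segment B', null);
    insert into indiv values (10, 'PERSON, ONE'), (11, 'PERSON, TWO');
    -- Individual 11 is a member of both segments
    insert into indiv_seg_memb values (1, 10), (1, 11), (2, 11);
    insert into indiv_contrib values (10, 100.0, 2004), (11, 250.0, 2006);

    -- Same view chain as in this notebook, but with two segments in context
    create view ctx_iseg as
    select id, name, description from indiv_seg isg where isg.id in (1, 2);

    create view ctx_iseg_memb as
    select ism.* from ctx_iseg isx
      join indiv_seg_memb ism on ism.indiv_seg_id = isx.id;

    create view ctx_indiv as
    select i.* from ctx_iseg_memb ismx
      join indiv i on i.id = ismx.indiv_id;

    create view ctx_indiv_contrib as
    select ic.* from ctx_indiv ix
      join indiv_contrib ic on ic.indiv_id = ix.id;
""")

contribs, total_amt = con.execute(
    "select count(*), sum(transaction_amt) from ctx_indiv_contrib"
).fetchone()
base_contribs = con.execute("select count(*) from indiv_contrib").fetchone()[0]
print(contribs, total_amt, base_contribs)  # 3 600.0 2
```

In this toy data the context reports three contributions totaling 600.0 although the underlying table holds only two rows, because individual 11 is counted once for each segment membership. With a single-segment context, as in this notebook, the issue does not arise unless the same individual is listed more than once within the segment.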
