### 3) Prep Release Data for EwE 

Sep 2020 By: G Oldford

Data In: 
   
    Actual_releases_COORDS.csv - has coordinates and other metadata - note that 'STOCK_...' fields describe the stock and 'REL_...' fields describe the release (the stock location may not be the release location)
    

Original Data (see previous notebook):

    EPAD data from Carl (DFO / SEP) - has no coordinates - 'actual_releases.csv'
    RMIS release location data (edits by SOGDC) - has coordinates - 'rmis_locations_2019.csv'
    PSF release and release location data - has coordinates - 'PSEReleasesAndLocations2019.csv'
    Coordinate table by Greig - has a few problem coordinates georeferenced with best-guess

Purpose:

    Export a series of CSVs (Biomass, Row, Col), one per month, with one set of files for each functional group (each species and life stage)
    1. Step 1: extract start and end month of release (take average)
    2. Step 2: aggregate releases by month, species, and "group" (as defined in PhD work)
    3. Step 3: set a row / col corresponding to the custom, rotated EwE map

Notes:

    EPAD data from Carl Walters and RMIS locations data from SOGDC
    rmis_smolt_releases dataset from 'rmis_releases.csv' from http://sogdatacentre.ca/search-data/spatial-data/ from all_layers->rmis->rmis_smolt_releases
    prioritized effort for coordinate matching on just coho and Chinook


## TOC: <a class="anchor" id="top"></a>
* [1. Read file and inspect](#section-1)
* [2. Fix issues with dates](#section-2)
* [3. Calculate month of release](#section-3)
* [4. Assign model map row / col](#section-4)
* [5. Match releases to the EwE Functional Group](#section-5)
    * [5a) Round 1 matching (release CU)](#section-5a)
    * [5b) Round 2 matching (stock CU)](#section-5b)
    * [5c) Round 3 matching: use EWE row / col and species](#section-5c)
    * [5d) Round 4 matching: final clean-up ad hoc](#section-5d)
* [6. Write to File](#section-6)
* [experiments etc](#section-X)

## 1. Read file and inspect <a class="anchor" id="section-1"></a>

In [356]:
# pandas is a library for R-like data frame operations in Python
import pandas as pd
import numpy as np

# locations table from the SSMSP SOGDC (may have more lats / lons added than source at RMIS)
path = "C:\\Users\\Greig\\Sync\\6. SSMSP Model\\Model Greig\\Data\\1. Salmon\\Hatchery Releases\\EPADHatcherReleasesGST\\MODIFIED\\"
releases_df = pd.read_csv(path + "actual_releases_COORDS.csv")

# table above was edited in ArcMap to find closest model row / cols
# read in result: 
fromarcmap_df = pd.read_csv(path + "allspecies_coordseditedinarcmap.csv")
fromarcmap_df['UniqueID'] = fromarcmap_df['Field1']

In [357]:
# note: totals below are grouped by BROOD_YEAR, and the largest total appears at brood year 1987
print("Quick check - chinook in 1988 should be around 33 mil, with no other years higher")
print(releases_df.loc[(releases_df['SPECIES_NAME']=='Chinook')].groupby(['BROOD_YEAR'])['TotalRelease'].sum())

Quick check - chinook in 1988 should be around 33 mil, with no other years higher
BROOD_YEAR
1967      277630
1968      603964
1969       67326
1970      575466
1971      993309
1972      920059
1973      820762
1974      270419
1975     1666042
1976     2131779
1977     4606310
1978     3832086
1979     6939012
1980     8781684
1981     7235718
1982    10784044
1983    12544617
1984    14707240
1985    19921013
1986    26835027
1987    33371544
1988    32422580
1989    30296389
1990    33235981
1991    27422636
1992    25953848
1993    22662371
1994    24738625
1995    20272197
1996    25675833
1997    21279403
1998    25735150
1999    27732676
2000    23944639
2001    29244438
2002    25376676
2003    25509493
2004    23058617
2005    21611447
2006    21375249
2007    18652127
2008    17963356
2009    17778127
2010    16258001
2011    17595503
2012    14428867
2013    11433004
2014    13597354
2015     7236484
Name: TotalRelease, dtype: int64


In [358]:
releases_df[['UniqueID','SPECIES_NAME','REL_CU_NAME','BIOMASS_MT','START_YR_REL','START_MO_REL','START_DAY_REL','START_DATE','END_YR_REL','END_MO_REL','END_DAY_REL','RELEASE_YEAR']]

Unnamed: 0,UniqueID,SPECIES_NAME,REL_CU_NAME,BIOMASS_MT,START_YR_REL,START_MO_REL,START_DAY_REL,START_DATE,END_YR_REL,END_MO_REL,END_DAY_REL,RELEASE_YEAR
0,0,Chinook,LOWER FRASER RIVER_FA_0.3,0.095770,1981,6.0,29.0,19810629,1981,7.0,4.0,1981
1,1,Chinook,LOWER FRASER RIVER_FA_0.3,0.100720,1981,6.0,29.0,19810629,1981,7.0,4.0,1981
2,2,Chinook,LOWER FRASER RIVER_FA_0.3,0.127346,1982,5.0,,198205,1982,5.0,,1982
3,3,Chinook,LOWER FRASER RIVER_FA_0.3,0.150289,1983,4.0,,198304,1983,4.0,25.0,1983
4,4,Chinook,LOWER FRASER RIVER_FA_0.3,0.172651,1986,6.0,,198606,1986,6.0,6.0,1986
...,...,...,...,...,...,...,...,...,...,...,...,...
21446,21446,Coho,LOWER FRASER,0.022000,2003,4.0,1.0,20030401,2003,6.0,30.0,2003
21447,21447,Coho,LOWER FRASER,0.022000,2004,4.0,1.0,20040401,2004,6.0,30.0,2004
21448,21448,Coho,LOWER FRASER,0.022000,2004,4.0,1.0,20040401,2004,6.0,30.0,2004
21449,21449,Coho,LOWER FRASER,0.022000,2005,4.0,1.0,20050401,2005,6.0,30.0,2005


## 2. Fix issues with dates <a class="anchor" id="section-2"></a>

[BACK TO TOP](#top)

In [359]:
# fix NaNs: fill missing days with 15 (mid-month) and missing months with 1 (January)
import datetime, calendar
import math
releases_df['START_DAY_REL'] = releases_df['START_DAY_REL'].fillna(15)
releases_df['END_DAY_REL'] = releases_df['END_DAY_REL'].fillna(15)
releases_df['START_MO_REL'] = releases_df['START_MO_REL'].fillna(1)
releases_df['END_MO_REL'] = releases_df['END_MO_REL'].fillna(1)

# convert to integers
releases_df['START_DAY_REL'] = releases_df['START_DAY_REL'].astype(int)
releases_df['END_DAY_REL'] = releases_df['END_DAY_REL'].astype(int)
releases_df['START_MO_REL'] = releases_df['START_MO_REL'].astype(int)
releases_df['END_MO_REL'] = releases_df['END_MO_REL'].astype(int)

#print(releases_df[['START_MO_REL','START_DAY_REL','END_MO_REL','END_DAY_REL']])

In [360]:
# helpful datetime functions
def makedate(year_col, month_col, day_col):
    return datetime.date(year_col, month_col, day_col)

def getavgrel_datetime(year_start, month_start, day_start, year_end, month_end, day_end):
    # midpoint between the start and end dates of the release window
    date_start = makedate(year_start, month_start, day_start)
    date_end = makedate(year_end, month_end, day_end)
    return date_start + ((date_end - date_start) / 2)

def getavgrel_month(year_start, month_start, day_start, year_end, month_end, day_end):
    # month of the midpoint date
    return getavgrel_datetime(year_start, month_start, day_start,
                              year_end, month_end, day_end).month

def getavgrel_year(year_start, month_start, day_start, year_end, month_end, day_end):
    # year of the midpoint date
    return getavgrel_datetime(year_start, month_start, day_start,
                              year_end, month_end, day_end).year
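
As a worked check of the midpoint logic in the helpers above, the sketch below applies the same arithmetic to two release windows that appear in the tables of this notebook. `midpoint_date` is a hypothetical name used only for illustration. Adding a `timedelta` to a `datetime.date` ignores the sub-day part, so a half-day remainder is dropped rather than rounded.

```python
import datetime

def midpoint_date(start, end):
    """Return the calendar date halfway between two datetime.date values."""
    # date + timedelta uses only the whole-day part of the timedelta
    return start + (end - start) / 2

# 29 June 1981 to 4 July 1981 (5 days apart)
mid_a = midpoint_date(datetime.date(1981, 6, 29), datetime.date(1981, 7, 4))

# start day was missing and filled with 15; recorded end day is the 6th
mid_b = midpoint_date(datetime.date(1986, 6, 15), datetime.date(1986, 6, 6))

print(mid_a, mid_a.month)  # 1981-07-01 7
print(mid_b, mid_b.month)  # 1986-06-10 6
```

The second case shows that filling a missing start day with 15 can make the window run backwards when the end day is earlier in the month; the midpoint still falls between the two dates.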

## 3. Calculate avg month of release <a class="anchor" id="section-3"></a>
    - using the midpoint of the start and end dates
    
[BACK TO TOP](#top)

In [361]:
# find avg date of releases
releases_df['release_avg_month'] = releases_df.apply(lambda x: getavgrel_month(x.START_YR_REL,x.START_MO_REL,x.START_DAY_REL,x.END_YR_REL,x.END_MO_REL,x.END_DAY_REL), axis=1)
releases_df['release_avg_year'] = releases_df.apply(lambda x: getavgrel_year(x.START_YR_REL,x.START_MO_REL,x.START_DAY_REL,x.END_YR_REL,x.END_MO_REL,x.END_DAY_REL), axis=1)

#REL_CU_NAME, SPECIES_NAME, release_avg_month, release_avg_year
releases_df['release_avg_month'].unique()

print(releases_df.loc[(releases_df['SPECIES_NAME']=='Chinook')].groupby(['REL_CU_NAME',
                                                                         'release_avg_year',
                                                                         'release_avg_month'])['BIOMASS_MT'].sum().reset_index())

# create a datetime column (python format) for average release date

releases_df['release_avg_date'] = releases_df.apply(lambda x: getavgrel_datetime(x.START_YR_REL,x.START_MO_REL,x.START_DAY_REL,x.END_YR_REL,x.END_MO_REL,x.END_DAY_REL), axis=1)


                    REL_CU_NAME  release_avg_year  release_avg_month  \
0           BOUNDARY BAY_FA_0.3              1984                  4   
1           BOUNDARY BAY_FA_0.3              1985                  4   
2           BOUNDARY BAY_FA_0.3              1986                  4   
3           BOUNDARY BAY_FA_0.3              1987                  4   
4           BOUNDARY BAY_FA_0.3              1987                  5   
...                         ...               ...                ...   
1361  UPPER FRASER RIVER_SP_1.3              2007                  4   
1362  UPPER FRASER RIVER_SP_1.3              2008                  3   
1363  UPPER FRASER RIVER_SP_1.3              2014                  5   
1364  UPPER FRASER RIVER_SP_1.3              2015                  6   
1365  UPPER FRASER RIVER_SP_1.3              2016                  6   

      BIOMASS_MT  
0       0.066300  
1       0.007500  
2       0.036468  
3       0.018560  
4       0.083647  
...          ...  
13
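
The row-wise `apply` calls above could probably be replaced by a vectorized computation. The following is a minimal sketch on a toy frame, assuming the same column names as `releases_df`; `to_dates` is a hypothetical helper, and the values are illustrative only. It relies on `pd.to_datetime` accepting a frame with `year`, `month` and `day` columns.

```python
import pandas as pd

# Toy frame with the notebook's column names (values are illustrative)
toy = pd.DataFrame({
    "START_YR_REL": [1981, 1983],
    "START_MO_REL": [6, 4],
    "START_DAY_REL": [29, 15],
    "END_YR_REL": [1981, 1983],
    "END_MO_REL": [7, 4],
    "END_DAY_REL": [4, 25],
})

def to_dates(df, prefix):
    """Assemble <prefix>_YR_REL / _MO_REL / _DAY_REL into a datetime Series."""
    parts = df[[f"{prefix}_YR_REL", f"{prefix}_MO_REL", f"{prefix}_DAY_REL"]].copy()
    parts.columns = ["year", "month", "day"]
    return pd.to_datetime(parts)

start = to_dates(toy, "START")
end = to_dates(toy, "END")

# Midpoint of each window, floored to the day to mirror datetime.date arithmetic
avg = (start + (end - start) / 2).dt.floor("D")
toy["release_avg_month"] = avg.dt.month
toy["release_avg_year"] = avg.dt.year
print(toy[["release_avg_year", "release_avg_month"]])
```

On a table of about 21,000 rows the speed difference is unlikely to matter much; the main gain is that the date assembly happens once instead of three times per row.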

## 4. Assign model map row / col <a class="anchor" id="section-4"></a>
    - most records have lats / lons, but many fall off the marine portion of the map, so placing them requires either a script or manual adjustments
    - decided to do it all manually in ArcMap, including assigning row / cols
    - I began investigating how to do it with the Release CU Name (see below), but there were too many NULLs, so I moved the release points and assigned rows and cols manually.
    
 [BACK TO TOP](#top)
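
The assignment here was done manually in ArcMap. For comparison, a scripted version could snap each release point to the nearest marine cell. The sketch below is only an illustration of that idea: `nearest_cell` is a hypothetical helper, and the cell-centre coordinates are made up (they are not the real rotated EwE grid), so the result shows the mechanics rather than a usable mapping.

```python
import numpy as np

def nearest_cell(lat, lon, cell_lats, cell_lons):
    """Return the index of the cell centre closest to (lat, lon) by haversine distance."""
    lat1, lon1 = np.radians(lat), np.radians(lon)
    lat2, lon2 = np.radians(cell_lats), np.radians(cell_lons)
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    dist_km = 2 * 6371.0 * np.arcsin(np.sqrt(a))
    return int(np.argmin(dist_km))

# Hypothetical marine cell centres: (row, col, lat, lon)
cells = np.array([
    (131, 45, 49.10, -123.20),
    (110, 7, 48.75, -123.60),
    (34, 12, 50.00, -125.10),
])

# A release point well inland of the marine map
idx = nearest_cell(49.2324, -121.9379, cells[:, 2], cells[:, 3])
row, col = int(cells[idx, 0]), int(cells[idx, 1])
print(row, col)
```

Straight-line nearest-cell snapping ignores river routing, so a release far upstream could land in the wrong estuary; that is one reason manual placement may still be preferable.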

In [362]:
# this is just a hack first pass (see above)
# row / col for each location
#fraser riv (IFR, LFR): 131, 45
#cowichan (COW): 110, 7
#upper georgia strait (UGS): 34, 12
#georgia strait mainland (UGS): 50, 29
#howe sound (?): 104, 55
releases_df['REL_CU_NAME'].unique()

array(['LOWER FRASER RIVER_FA_0.3', 'LOWER FRASER',
       'EAST VANCOUVER ISLAND-COWICHAN & KOKSILAH_FA_0.x', nan,
       'GEORGIA STRAIT', 'EAST VANCOUVER ISLAND-GEORGIA STRAIT',
       'LOWER THOMPSON_SP_1.2', 'LOWER THOMPSON',
       'MIDDLE FRASER RIVER_SU_1.3',
       'EAST VANCOUVER ISLAND-NANAIMO & CHEMAINUS_FA_0.x',
       'EAST VANCOUVER ISLAND-GEORGIA STRAIT_SU_0.3',
       'UPPER FRASER RIVER_SP_1.3',
       'SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION<<BIN>>',
       'GEORGIA STRAIT MAINLAND',
       'SOUTHERN MAINLAND-GEORGIA STRAIT_FA_0.x',
       'HOWE SOUND-BURRARD INLET', 'EAST HOWE SOUND-BURRARD INLET',
       'NORTH THOMPSON',
       'EAST VANCOUVER ISLAND-QUALICUM & PUNTLEDGE_FA_0.x',
       'FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION<<BIN>>',
       'MARIA SLOUGH_SU_0.3', 'FRASER RIVER',
       'MIDDLE FRASER RIVER_SP_1.3', 'INTERIOR FRASER',
       'EAST VANCOUVER ISLAND-GOLDSTREAM_FA_0.x', 'FRASER CANYON',
       'SHUSWAP RIVER_SU_0.3', 'SOUTH THOMPSON', 'BOUN

In [363]:
# there are issues with 31% of the records - empty / NaN / null values for REL_CU_NAME
releases_df['problemswithCUNAME'] = releases_df['REL_CU_NAME'].isnull()
nullvalues = len(releases_df.loc[(releases_df['problemswithCUNAME']==True)])
notnullvalues = len(releases_df.loc[(releases_df['problemswithCUNAME']==False)])
nullvalues / (notnullvalues + nullvalues)

0.3128059297934828
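
The same fraction can be computed more directly, because the mean of a boolean mask is the share of `True` values. A small sketch on toy data (values are illustrative), which also breaks the share down by species:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "SPECIES_NAME": ["Chinook", "Chinook", "Coho", "Coho"],
    "REL_CU_NAME": ["LOWER FRASER RIVER_FA_0.3", np.nan, np.nan, np.nan],
})

missing = toy["REL_CU_NAME"].isnull()
overall = missing.mean()                                # share of records with no release CU
by_species = missing.groupby(toy["SPECIES_NAME"]).mean()  # same share per species
print(overall)
print(by_species)
```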

### Summary of matching EWE map row / col to release lat / lon
     - I took all hatchery releases and found a corresponding Row / Col using ArcMap. For most I had to manually find the most reasonable nearest map model row / col. 
     - next step is to re-export as a CSV time series for EwE
     - need to know the EwE functional group code, though
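
The re-export step mentioned above could look roughly like the sketch below. It is an assumption-laden illustration: `COL_EWE` is a hypothetical column name (only `ROW_EWE` appears in this notebook so far), the file naming pattern is invented, and the toy values are illustrative. Biomass is summed per cell, then one CSV (Biomass, Row, Col) is written per functional group and month.

```python
import os
import tempfile
import pandas as pd

# Toy records; COL_EWE and the file names are assumptions for illustration
toy = pd.DataFrame({
    "EWE_GROUP_CODE": ["Chinook-H-LFR-2", "Chinook-H-LFR-2", "Coho-H-LFR-2"],
    "release_avg_year": [1981, 1981, 1981],
    "release_avg_month": [7, 7, 5],
    "ROW_EWE": [131, 131, 136],
    "COL_EWE": [45, 45, 50],
    "BIOMASS_MT": [0.09577, 0.10072, 0.022],
})

keys = ["EWE_GROUP_CODE", "release_avg_year", "release_avg_month"]

# Sum biomass per group, month and map cell
cell_sums = toy.groupby(keys + ["ROW_EWE", "COL_EWE"], as_index=False)["BIOMASS_MT"].sum()

out_dir = tempfile.mkdtemp()
written = []
for (group, year, month), chunk in cell_sums.groupby(keys):
    fname = f"{group}_{int(year)}_{int(month):02d}.csv"
    chunk[["BIOMASS_MT", "ROW_EWE", "COL_EWE"]].to_csv(
        os.path.join(out_dir, fname), index=False)
    written.append(fname)

print(sorted(written))
```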


In [13]:
# utility code - not used (did changes in ArcMap)
#df.loc[df.my_channel > 20000, 'my_channel'] = 0

#releases_df.loc[(releases_df['REL_CU_NAME']=='LOWER FRASER RIVER_FA_0.3'),'EWE_ROW'] = 131
#releases_df.loc[(releases_df['REL_CU_NAME']=='LOWER FRASER RIVER_FA_0.3'),'EWE_COL'] = 45
#releases_df.loc[(releases_df['REL_CU_NAME']=='LOWER FRASER'),'EWE_ROW'] = 131
#releases_df.loc[(releases_df['REL_CU_NAME']=='LOWER FRASER'),'EWE_COL'] = 45
#releases_df.columns

In [364]:
#releases_df['UniqueID'] = releases_df[.index]
releases_df.head()

Unnamed: 0,UniqueID,AVE_WEIGHT,BIOMASS_MT,BROOD_YEAR,COORD_SourceRound,END_DATE,END_DAY_REL,END_MO_REL,END_YR_REL,FACILITY_NAME,...,region,rmis_latitude,rmis_longitude,rpagency,source,submission,release_avg_month,release_avg_year,release_avg_date,problemswithCUNAME
0,0,5.0,0.09577,1980,,19810704,4,7,1981,Smokehouse H,...,FRTH,49.2324,-121.9379,CDFO,O,2019-02-04,7,1981,1981-07-01,False
1,1,5.0,0.10072,1980,,19810704,4,7,1981,Smokehouse H,...,FRTH,49.2324,-121.9379,CDFO,O,2019-02-04,7,1981,1981-07-01,False
2,2,1.6,0.127346,1981,,198205,15,5,1982,Smokehouse H,...,FRTH,49.2189,-121.9451,CDFO,O1,2019-02-04,5,1982,1982-05-15,False
3,3,2.142764,0.150289,1982,,19830425,25,4,1983,Smokehouse H,...,FRTH,49.2189,-121.9451,CDFO,O1,2019-02-04,4,1983,1983-04-20,False
4,4,2.8,0.172651,1985,,19860606,6,6,1986,Smokehouse H,...,FRTH,49.2189,-121.9451,CDFO,O1,2019-02-04,6,1986,1986-06-10,False


In [365]:
# join
releases_ewerowscols = pd.merge(releases_df, fromarcmap_df, on=['UniqueID'], how='left')
# these columns should match, except that I inserted 15 where the start day was missing (0 in the ArcMap table)
releases_ewerowscols[['START_DAY_REL','START_DAY_']]

Unnamed: 0,START_DAY_REL,START_DAY_
0,29,29.0
1,29,29.0
2,15,0.0
3,15,0.0
4,15,0.0
...,...,...
21446,1,1.0
21447,1,1.0
21448,1,1.0
21449,1,1.0
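
A left join like the one above can silently duplicate rows if `UniqueID` repeats in either table, and it leaves NaNs where the ArcMap table has no match. One way to guard against both, sketched here on toy data, is to use the `validate` and `indicator` options of `pd.merge`:

```python
import pandas as pd

# Toy stand-ins for releases_df and fromarcmap_df (values are illustrative)
left = pd.DataFrame({"UniqueID": [0, 1, 2], "START_DAY_REL": [29, 15, 1]})
right = pd.DataFrame({"UniqueID": [0, 1], "ROW_EWE": [131, 66]})

# validate raises MergeError if UniqueID is duplicated on either side;
# indicator adds a _merge column showing where each row came from
merged = pd.merge(left, right, on="UniqueID", how="left",
                  validate="one_to_one", indicator=True)

unmatched = merged.loc[merged["_merge"] == "left_only", "UniqueID"].tolist()
print(len(merged) == len(left), unmatched)
```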


In [366]:
releases_ewerowscols.columns

Index(['UniqueID', 'AVE_WEIGHT', 'BIOMASS_MT', 'BROOD_YEAR',
       'COORD_SourceRound', 'END_DATE', 'END_DAY_REL', 'END_MO_REL',
       'END_YR_REL', 'FACILITY_NAME', 'FID', 'FINAL_LAT', 'FINAL_LON',
       'FeatureType', 'GAZETTED_NAME', 'LATITUDE_PSF', 'LAT_GLO', 'LAT_GLO_x',
       'LAT_GLO_y', 'LONGITUDE_PSF', 'LON_GLO', 'LON_GLO_x', 'LON_GLO_y',
       'MRP_TAGCODE', 'NEW_WATERSHED_CODE', 'NONZERO_MEAN_WEIGHT', 'NoTagClip',
       'NoTagNum', 'NoTagPartMarkNum', 'Note', 'PROGRAM_CODE', 'PROJ_NAME',
       'PURPOSE_CODE', 'REARING_TYPE_CODE', 'RELEASE_COMMENT',
       'RELEASE_SITE_NAME', 'RELEASE_SITE_NAME_G', 'RELEASE_SITE_NAME_x',
       'RELEASE_SITE_NAME_y', 'RELEASE_STAGE_NAME', 'RELEASE_YEAR',
       'REL_CU_INDEX', 'REL_CU_NAME', 'RUN_NAME', 'Release_Site', 'RowNum',
       'SOURCE', 'SPECIES_NAME', 'START_DATE', 'START_DAY_REL', 'START_MO_REL',
       'START_YR_REL', 'STOCK_CU_INDEX', 'STOCK_CU_NAME', 'STOCK_NAME',
       'STOCK_PROD_AREA_CODE', 'STOCK_TYPE_CODE', 'ShedTa

In [367]:
#1/3 of records have no release conservation unit
temp = releases_ewerowscols[['REL_CU_NAME','REL_CU_INDEX']]
#temp.drop_duplicates()
temp[temp['REL_CU_NAME'].notnull()]

# how many have no CU after 1978?
temp = releases_ewerowscols[['REL_CU_NAME','REL_CU_INDEX','START_YR_REL']].loc[releases_ewerowscols['START_YR_REL']>1978]
temp[temp['REL_CU_NAME'].notnull()]
# roughly same

# conclusion: the release conservation unit name is empty 1/3 of the time. 
print("the release conservation unit name is empty 1/3 of the time.")

the release conservation unit name is empty 1/3 of the time.


In [368]:
#note that 1/3 of records have no release conservation unit
temp = releases_ewerowscols[['START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]

# 'STOCK_PROD_AREA_CODE' will have to be used for records without conservation unit and stock_cu index
temp[(temp['STOCK_CU_INDEX'].isnull())&(temp['REL_CU_INDEX'].isnull())&
     (temp['SPECIES_NAME']!='Steelhead')&(temp['SPECIES_NAME']!='Cutthroat')]

# temp[(temp['STOCK_CU_INDEX'].isnull())&(temp['REL_CU_INDEX'].isnull())&
#      (temp['SPECIES_NAME']!='Steelhead')&(temp['SPECIES_NAME']!='Cutthroat')]

Unnamed: 0,START_YR_REL,GAZETTED_NAME,REL_CU_NAME,REL_CU_INDEX,ROW_EWE,RELEASE_SITE_NAME,STOCK_NAME,STOCK_CU_INDEX,SPECIES_NAME,STOCK_PROD_AREA_CODE
1337,1983,,,,66.0,Gray Cr/GSMN,Maclean Bay,,Chinook,GSMN
1339,1993,,,,66.0,Egmont Point,Maclean Bay,,Chinook,GSMN
1346,1988,,,,66.0,Burnet Cr,Maclean Bay,,Chum,GSMN
1349,1989,,,,66.0,Burnet Cr,Maclean Bay,,Chum,GSMN
1356,1990,,,,66.0,Burnet Cr,Maclean Bay,,Coho,GSMN
...,...,...,...,...,...,...,...,...,...,...
21383,2004,,,,136.0,,Pitt R Up,,Coho,LWFR
21384,2005,,,,136.0,,Pitt R Up,,Coho,LWFR
21385,2006,,,,136.0,,Pitt R Up,,Coho,LWFR
21386,2007,,,,136.0,,Pitt R Up,,Coho,LWFR


## 5. Match releases to the EwE Functional Group <a class="anchor" id="section-5"></a>

[BACK TO TOP](#top)

In [369]:
# EwE functional groups ID and text name as stored in data model
# (actual Ecospace or Ecopath ID will differ)
# Groups: 
# Chinook-H-IFR-2 - Interior Fraser Spring 4_2 / 1.2 (CK-16, CK-17), Fraser Spring 5_2 / 1.3 (CK-10, CK-12, CK-14, CK-18), Fraser Summer 5_2  / 1.3 (CK-09, CK-11, CK-19), Fraser Summer 4_1    / 0.3 (CK-07, CK-13, CK-15); CWT Indicator Stocks: Nicola (NIC), Shuswap (SHU), Middle Shuswap (MSH)
# Chinook-H-LFR-2 - Management Unit and Conservation Units: Fraser Fall 4_1 / 0.3 (CK-2, CK-3, CK-4, CK-5, CK-6, CK-20,CK-9006, CK-9007,CK-9008); CWT Indicator Stocks: Harrison (HAR), Chilliwack (CHI)
# Chinook-H-COW-2 - Management Unit and Conservation Units: Lower Georgia Strait Nanaimo to Cowichan (CK-21, CK-22, CK-25); CWT Indicator Stocks: Cowichan (COW)
# Chinook-H-UGS-2 - Management Unit and Conservation Units: Upper Georgia Strait (CK-28, CK-29, CK-27, CK-83); CWT Indicator Stocks: Quinsam (QUI), Phillips (PHI), Puntledge (PPS), Big Qualicum (BQR)
# Coho-H-IFR-2 - Management Unit and Conservation Units: Interior Fraser (CO-4, CO-5, CO-6, CO-7, CO-8, CO-9, CO-48). CWT Indicator stocks (uncertain - see CW / JK spreadsheet; coldwater, deadman, spius, dunn, louis, lemieux, eagle)
# Coho-H-LFR-2 - Management Unit and Conservation Units: Lower Fraser & Boundary Bay (CO-1, CO-2, CO-3, CO-10, CO-47). CWT Indicator stocks: Inch Cr, Louis (Dunn Creek H), Chilliwack
# Coho-H-UGS-2 - (lat >= 49, CO-13, CO-11) Management Unit and Conservation Units: East Coast Vancouver Island + Georgia Strait (CO-11,CO-13); CWT indicator stocks: Quinsam, Big Qualicum, Black, Puntledge, Goldstream
# Coho-H-COW-2 - (lat < 49, CO-13, CO-11) Management Unit and Conservation Units: East Coast Vancouver Island + Georgia Strait (CO-11, CO-13); CWT indicator stocks: none

# Rearing type codes (REARING_TYPE_CODE)
# H - hatchery, seapen, lakepen, rearing channel.
# W - wild unfed
# F - wild fed (held short term in pens in river prior to release)
# U - unknown

# stock type codes (STOCK_TYPE_CODE)
# H - Hatchery
# W - Wild
# M - Mixed (hatchery and wild)
# U - Unknown

# Source: Serbic, G. 1992. The Finclip Recovery Database & Reporting System

In [370]:
#releases_ewerowscols[(releases_ewerowscols["REARING_TYPE_CODE"]=='W')&(releases_ewerowscols["SPECIES_NAME"]=='Chinook')]
releases_ewerowscols["REARING_TYPE_CODE"].unique()

array(['H', 'W', 'F'], dtype=object)
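
For inspection it can help to attach the code descriptions listed above to the data. A small sketch, assuming a lookup dictionary (`REARING_TYPE_LABELS` is a hypothetical name) built from the code list in the previous cell, applied to toy values:

```python
import pandas as pd

# Descriptions copied from the rearing type code list above
REARING_TYPE_LABELS = {
    "H": "hatchery, seapen, lakepen, rearing channel",
    "W": "wild unfed",
    "F": "wild fed",
    "U": "unknown",
}

toy = pd.Series(["H", "W", "F", "H"], name="REARING_TYPE_CODE")
labels = toy.map(REARING_TYPE_LABELS)   # code -> description
counts = toy.value_counts()             # number of records per code
print(labels.tolist())
print(counts["H"])
```

Only records with code `H` are matched to functional groups in the rounds that follow.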

### 5a) Round 1 matching
<a class="anchor" id="section-5a"></a>

[BACK TO TOP](#top)

In [371]:
# Round 1 matching: use REL_CU_INDEX (release conservation unit ID) to match releases to EWE model group
# Chinook
releases_ewerowscols["EWE_GROUP_CODE"] = "x"
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CK-16") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-17") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-10") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-12") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-14") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-18") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-09") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-11") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-19") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-7") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-13") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-15")), "Chinook-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CK-2") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-3")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-4") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-5") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-6") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-9006")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-9007")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-9008")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-20")), "Chinook-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CK-21") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-22") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-25")), "Chinook-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CK-27") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-28") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-29")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CK-83")), "Chinook-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])
# Coho
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CO-4") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-5") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-6") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-7") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-8") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-9")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-48")), "Coho-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CO-1") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-2") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-3") |
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-10")|
                                                   (releases_ewerowscols["REL_CU_INDEX"] == "CO-47")), "Coho-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CO-13") |
                                                  (releases_ewerowscols["REL_CU_INDEX"] == "CO-11")) & 
                                                  (releases_ewerowscols["FINAL_LAT"] >= 49), "Coho-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["REL_CU_INDEX"] == "CO-13") & 
                                                  (releases_ewerowscols["FINAL_LAT"] < 49), "Coho-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) & 
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))

the number of release records:
14104
the number of unmatched release records:
3590
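
The long chains of `==` comparisons in Round 1 could be expressed more compactly with a lookup table and `isin`. The sketch below shows the idea for two of the Chinook groups only, on toy records; `GROUP_CUS` and `assign_groups` are hypothetical names, and the CU lists are copied from the code above.

```python
import numpy as np
import pandas as pd

# CU lists copied from the Round 1 code for two Chinook groups
GROUP_CUS = {
    "Chinook-H-COW-2": ["CK-21", "CK-22", "CK-25"],
    "Chinook-H-UGS-2": ["CK-27", "CK-28", "CK-29", "CK-83"],
}

def assign_groups(df, cu_col, species, group_cus):
    """Fill EWE_GROUP_CODE for hatchery-reared records of one species by CU index."""
    out = df["EWE_GROUP_CODE"].copy()
    base = (df["SPECIES_NAME"] == species) & (df["REARING_TYPE_CODE"] == "H")
    for group, cus in group_cus.items():
        # overwrite only rows of this species / rearing type whose CU is listed
        out = out.mask(base & df[cu_col].isin(cus), group)
    return out

toy = pd.DataFrame({
    "SPECIES_NAME": ["Chinook", "Chinook", "Chinook", "Coho"],
    "REARING_TYPE_CODE": ["H", "H", "W", "H"],
    "REL_CU_INDEX": ["CK-22", "CK-83", "CK-22", np.nan],
    "EWE_GROUP_CODE": "x",
})
toy["EWE_GROUP_CODE"] = assign_groups(toy, "REL_CU_INDEX", "Chinook", GROUP_CUS)
print(toy["EWE_GROUP_CODE"].tolist())
```

Keeping the CU lists in one dictionary also makes it easier to spot formatting inconsistencies between codes, such as zero-padded versus unpadded numbers.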


### 5b) Round 2 matching: use STOCK_CU_INDEX
<a class="anchor" id="section-5b"></a>

[BACK TO TOP](#top)
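
Round 2 fills only the records still marked `"x"` after Round 1, so the release CU takes priority over the stock CU. A minimal sketch of that fallback order on toy coho records, assuming a hypothetical `GROUPS` dictionary with CU lists copied from the Round 1 code:

```python
import numpy as np
import pandas as pd

# Coho CU lists copied from the Round 1 code
GROUPS = {
    "Coho-H-IFR-2": ["CO-4", "CO-5", "CO-6", "CO-7", "CO-8", "CO-9", "CO-48"],
    "Coho-H-LFR-2": ["CO-1", "CO-2", "CO-3", "CO-10", "CO-47"],
}

toy = pd.DataFrame({
    "REL_CU_INDEX": ["CO-47", np.nan, np.nan],
    "STOCK_CU_INDEX": ["CO-4", "CO-47", np.nan],
})

code = pd.Series("x", index=toy.index)
for col in ["REL_CU_INDEX", "STOCK_CU_INDEX"]:   # priority: release CU, then stock CU
    for group, cus in GROUPS.items():
        still_unmatched = code == "x"
        code = code.mask(still_unmatched & toy[col].isin(cus), group)

print(code.tolist())
```

The first record keeps its release-CU match even though its stock CU belongs to a different group, and the last record stays unmatched for the later rounds.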

In [372]:
# Round 2 matching: use STOCK_CU_INDEX (stock conservation unit ID) to match still-unmatched releases to EWE model group
# Chinook
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CK-16") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-17") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-10") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-12") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-14") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-18") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-09") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-11") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-19") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-7") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-13") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-15")), "Chinook-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CK-2") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-3") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-4") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-5") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-6") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-9006") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-9007") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-9008")|
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-20")), "Chinook-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CK-21") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-22")|
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-25")), "Chinook-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CK-27") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-28") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-29")|
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CK-83")), "Chinook-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])
# Coho
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CO-4") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-5") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-6") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-7") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-8") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-9") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-48")), "Coho-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CO-1") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-2") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-3") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-10") |
                                                   (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-47")), "Coho-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CO-13") |
                                                  (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-11")) & 
                                                  (releases_ewerowscols["FINAL_LAT"] >= 49), "Coho-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"] == "x") &
                                                  (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-13") & 
                                                  (releases_ewerowscols["FINAL_LAT"] < 49), "Coho-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))


temp2 = temp[(temp['EWE_GROUP_CODE']=='x') & 
             (temp['REL_CU_NAME'].notnull()) &
             (temp['REARING_TYPE_CODE'] == 'H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]
temp2['REL_CU_INDEX'].unique()


the number of release records:
14104
the number of unmatched release records:
544


array(['CO-13', 'CO-11'], dtype=object)
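
The matching cells above repeat one pattern: select hatchery releases of a species that are still unmatched (`EWE_GROUP_CODE == "x"`) and whose CU index is in a given list, then write a group code. A minimal sketch of that pattern as a helper is shown below. The function name `assign_group` and the demo rows are hypothetical and are not part of the notebook; the column names and CU lists are taken from the cells above.

```python
import pandas as pd

def assign_group(df, species, cu_codes, group_code, cu_col="STOCK_CU_INDEX"):
    """Label still-unmatched hatchery releases of one species whose CU is in cu_codes.

    Modifies df in place and returns the number of rows that were labelled.
    """
    mask = (
        (df["SPECIES_NAME"] == species)
        & (df["REARING_TYPE_CODE"] == "H")
        & (df["EWE_GROUP_CODE"] == "x")
        & df[cu_col].isin(cu_codes)
    )
    df.loc[mask, "EWE_GROUP_CODE"] = group_code
    return int(mask.sum())

# made-up demo rows, for illustration only
demo = pd.DataFrame({
    "SPECIES_NAME": ["Coho", "Coho", "Chinook", "Coho"],
    "REARING_TYPE_CODE": ["H", "H", "H", "W"],
    "EWE_GROUP_CODE": ["x", "x", "x", "x"],
    "STOCK_CU_INDEX": ["CO-1", "CO-4", "CK-27", "CO-1"],
})

n_lfr = assign_group(demo, "Coho", ["CO-1", "CO-2", "CO-3", "CO-10", "CO-47"], "Coho-H-LFR-2")
n_ugs = assign_group(demo, "Chinook", ["CK-27", "CK-28", "CK-29", "CK-83"], "Chinook-H-UGS-2")
print(demo["EWE_GROUP_CODE"].tolist())
```

Using `isin` keeps each CU list in one place, and passing `cu_col="REL_CU_INDEX"` would cover the rounds that match on the release CU instead of the stock CU.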

In [373]:
# some of the remaining unmatched COW and UGS coho releases have no 'FINAL_LAT', so use 'ROW_EWE' instead
# note: row indices start in the north (northernmost row = 0), so ROW_EWE < 100 is the northern part of the map
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CO-13")|
                                                  (releases_ewerowscols["REL_CU_INDEX"] == "CO-11")) & 
                                                  (releases_ewerowscols["ROW_EWE"] < 100), "Coho-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["REL_CU_INDEX"] == "CO-13") |
                                                  (releases_ewerowscols["REL_CU_INDEX"] == "CO-11") ) & 
                                                  (releases_ewerowscols["ROW_EWE"] >= 100), "Coho-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CO-13")|
                                                  (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-11")) & 
                                                  (releases_ewerowscols["ROW_EWE"] < 100), "Coho-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["STOCK_CU_INDEX"] == "CO-13") |
                                                  (releases_ewerowscols["STOCK_CU_INDEX"] == "CO-11") ) & 
                                                  (releases_ewerowscols["ROW_EWE"] >= 100), "Coho-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])


temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))


the number of release records:
14104
the number of unmatched release records:
362
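
The north / south split for CO-11 and CO-13 coho uses latitude where it exists and the model row otherwise. A rough sketch of that priority order with `np.select` follows. It is a simplification, assuming the same thresholds as the cells above (latitude 49, row 100); the function name `north_south_group` and the demo values are hypothetical, and the species, rearing type and CU filters are left out for brevity.

```python
import numpy as np
import pandas as pd

def north_south_group(lat, row, lat_split=49.0, row_split=100):
    """Pick UGS (north) or COW (south): latitude first, model row as a fallback."""
    lat = pd.Series(lat, dtype="float64")
    row = pd.Series(row, dtype="float64")
    # np.select takes the first condition that is True for each element;
    # comparisons against NaN are False, so a missing latitude falls through to the row rules
    conditions = [
        lat >= lat_split,
        lat < lat_split,
        row < row_split,   # rows are indexed from the north
        row >= row_split,
    ]
    choices = ["Coho-H-UGS-2", "Coho-H-COW-2", "Coho-H-UGS-2", "Coho-H-COW-2"]
    return np.select(conditions, choices, default="x")

# made-up demo values: two with latitude, two with row only, one with neither
demo_groups = north_south_group(
    lat=[49.5, 48.4, np.nan, np.nan, np.nan],
    row=[40, 130, 60, 120, np.nan],
)
print(demo_groups.tolist())
```

Records with neither a latitude nor a row stay at `"x"` and are left for the later rounds.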


### 5c) Round 3 matching: use EWE row / col and species
<a class="anchor" id="section-5c"></a>

[BACK TO TOP](#top)

In [374]:
# Rows and cols corresponding to the model were matched earlier to lats and lons of releases
# the small number of unmatched releases can be matched to EWE functional groups using ewe release rows / cols, 
# and species name

# if ROW_EWE <= 90, the release can likely be associated with the 'upper Strait of Georgia' (UGS) groups

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["ROW_EWE"] <= 90), "Coho-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["ROW_EWE"] <= 90), "Chinook-H-UGS-2", releases_ewerowscols["EWE_GROUP_CODE"])

temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))

the number of release records:
14104
the number of unmatched release records:
318
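
The progress check at the end of each round applies the same two filters every time. One possible way to avoid repeating it is a small reporting function, sketched below. The name `match_summary` and the demo rows are hypothetical; the filters mirror the ones used in the cells above.

```python
import pandas as pd

def match_summary(df):
    """Return (hatchery coho/Chinook records, how many of those are still unmatched)."""
    target = df["SPECIES_NAME"].isin(["Coho", "Chinook"]) & (df["REARING_TYPE_CODE"] == "H")
    unmatched = target & (df["EWE_GROUP_CODE"] == "x")
    return int(target.sum()), int(unmatched.sum())

# made-up demo rows, for illustration only
demo = pd.DataFrame({
    "SPECIES_NAME": ["Coho", "Chinook", "Chum", "Coho"],
    "REARING_TYPE_CODE": ["H", "H", "H", "W"],
    "EWE_GROUP_CODE": ["x", "Chinook-H-LFR-2", "x", "x"],
})

n_records, n_unmatched = match_summary(demo)
print("the number of release records:", n_records)
print("the number of unmatched release records:", n_unmatched)
```

Calling it on `releases_ewerowscols` directly would also remove the need to rebuild `temp` before each check.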


### 5d) Round 4 matching: final clean-up ad hoc

<a class="anchor" id="section-5d"></a>

[BACK TO TOP](#top)

In [375]:
# what remains is a small share (roughly 2% of records) of unusual releases
temp2 = temp[(temp['EWE_GROUP_CODE']=='x') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]

# Some of the records are Chinook from Nitinat River stock (CK-31) released in Esquimalt Harbour
#print(temp2[temp2['STOCK_CU_INDEX']=="CK-31"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["STOCK_CU_INDEX"] =="CK-31"), "Chinook-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME','REL_CU_NAME','REL_CU_INDEX','ROW_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))

the number of release records:
14104
the number of unmatched release records:
290


In [376]:
# use 'STOCK_PROD_AREA_CODE' for the remainder
# - this field was not used earlier because the stock production area can differ from the release area,
#   and long-distance transplants are common
temp2 = temp[(temp['EWE_GROUP_CODE']=='x') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]

# chinook
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="LWFR"), "Chinook-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="UPFR") |
                                                  (releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="TOMF")), "Chinook-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])

# coho
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="LWFR"), "Coho-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  ((releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="UPFR") |
                                                  (releases_ewerowscols["STOCK_PROD_AREA_CODE"] =="TOMF")), "Coho-H-IFR-2", releases_ewerowscols["EWE_GROUP_CODE"])

temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME',
                             'REL_CU_NAME','REL_CU_INDEX','ROW_EWE','COL_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]
print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))

the number of release records:
14104
the number of unmatched release records:
103
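
The production-area step is a lookup from `STOCK_PROD_AREA_CODE` to a region, applied identically to both species. A sketch of the same rule as a dictionary plus `Series.map` is given below. The names `AREA_TO_REGION` and `assign_by_prod_area` are hypothetical, the demo rows are made up (including the placeholder area code `"OTHER"`), and the three area codes are the only ones taken from the cell above.

```python
import pandas as pd

# production area -> region part of the group code, as used in the cell above
AREA_TO_REGION = {"LWFR": "LFR", "UPFR": "IFR", "TOMF": "IFR"}

def assign_by_prod_area(df):
    """Fill still-unmatched hatchery coho/Chinook group codes from the stock production area."""
    region = df["STOCK_PROD_AREA_CODE"].map(AREA_TO_REGION)  # NaN where the area is not listed
    mask = (
        df["SPECIES_NAME"].isin(["Coho", "Chinook"])
        & (df["REARING_TYPE_CODE"] == "H")
        & (df["EWE_GROUP_CODE"] == "x")
        & region.notna()
    )
    # build e.g. "Chinook-H-LFR-2" from the species name and the mapped region
    df.loc[mask, "EWE_GROUP_CODE"] = df.loc[mask, "SPECIES_NAME"] + "-H-" + region[mask] + "-2"
    return df

demo = pd.DataFrame({
    "SPECIES_NAME": ["Chinook", "Coho", "Coho", "Chinook"],
    "REARING_TYPE_CODE": ["H", "H", "H", "H"],
    "EWE_GROUP_CODE": ["x", "x", "x", "Chinook-H-UGS-2"],
    "STOCK_PROD_AREA_CODE": ["LWFR", "TOMF", "OTHER", "UPFR"],
})
assign_by_prod_area(demo)
print(demo["EWE_GROUP_CODE"].tolist())
```

Rows that were matched in an earlier round are left alone, so the ordering of the rounds still decides which rule wins.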


In [377]:
# the remaining unmatched releases appear to be distributed in the south of the model domain.
# assign them to groups based purely on release row / col
# temp2 = temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
#      ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]
# temp2['COL_EWE'].unique()

# chinook
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["ROW_EWE"] > 90) &
                                                  (releases_ewerowscols["COL_EWE"] <= 20), "Chinook-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Chinook") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") &
                                                  (releases_ewerowscols["ROW_EWE"] > 90) &
                                                  (releases_ewerowscols["COL_EWE"] > 20), "Chinook-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])

# coho
releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") & 
                                                  (releases_ewerowscols["ROW_EWE"] > 90) &
                                                  (releases_ewerowscols["COL_EWE"] <= 20), "Coho-H-COW-2", releases_ewerowscols["EWE_GROUP_CODE"])

releases_ewerowscols["EWE_GROUP_CODE"] = np.where((releases_ewerowscols["SPECIES_NAME"] == "Coho") &
                                                  (releases_ewerowscols["REARING_TYPE_CODE"] == "H") &
                                                  (releases_ewerowscols["EWE_GROUP_CODE"]=="x") & 
                                                  (releases_ewerowscols["ROW_EWE"] > 90) &
                                                  (releases_ewerowscols["COL_EWE"] > 20), "Coho-H-LFR-2", releases_ewerowscols["EWE_GROUP_CODE"])


temp = releases_ewerowscols[['EWE_GROUP_CODE','REARING_TYPE_CODE','START_YR_REL','GAZETTED_NAME',
                             'REL_CU_NAME','REL_CU_INDEX','ROW_EWE','COL_EWE',
                             'RELEASE_SITE_NAME','STOCK_NAME','STOCK_CU_INDEX','SPECIES_NAME','STOCK_PROD_AREA_CODE']]

print("the number of release records:")
print(len(temp[((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook')) &
            (temp['REARING_TYPE_CODE']=='H')]))

print("the number of unmatched release records:")
print(len(temp[(temp['EWE_GROUP_CODE']=='x') & (temp['REARING_TYPE_CODE']=='H') &
     ((temp['SPECIES_NAME']=='Coho')|(temp['SPECIES_NAME']=='Chinook'))]))

the number of release records:
14104
the number of unmatched release records:
0
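
Taken together, the purely spatial fallbacks (round 3 and the cell above) amount to a three-way split of the model map. The sketch below writes that split as one plain function so the thresholds are visible in a single place. The name `region_from_cell` is hypothetical, the thresholds (row 90, col 20) are the ones used above, and returning `None` for a missing row or column is an assumption added here, not something the notebook does explicitly.

```python
import math

def _missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def region_from_cell(row, col, row_split=90, col_split=20):
    """Map a model cell to 'UGS', 'COW' or 'LFR'; None when the cell is unknown."""
    if _missing(row):
        return None
    if row <= row_split:          # northern rows: upper Strait of Georgia
        return "UGS"
    if _missing(col):
        return None
    # southern rows: western columns -> COW, eastern columns -> LFR
    return "COW" if col <= col_split else "LFR"

# made-up example cells
examples = [(45, 30), (120, 10), (120, 35), (float("nan"), 10)]
regions = [region_from_cell(r, c) for r, c in examples]
print(regions)
```

A group code could then be composed as species + `"-H-"` + region + `"-2"`, as in the codes assigned above.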


In [380]:
releases_ewerowscols.columns

Index(['UniqueID', 'AVE_WEIGHT', 'BIOMASS_MT', 'BROOD_YEAR',
       'COORD_SourceRound', 'END_DATE', 'END_DAY_REL', 'END_MO_REL',
       'END_YR_REL', 'FACILITY_NAME', 'FID', 'FINAL_LAT', 'FINAL_LON',
       'FeatureType', 'GAZETTED_NAME', 'LATITUDE_PSF', 'LAT_GLO', 'LAT_GLO_x',
       'LAT_GLO_y', 'LONGITUDE_PSF', 'LON_GLO', 'LON_GLO_x', 'LON_GLO_y',
       'MRP_TAGCODE', 'NEW_WATERSHED_CODE', 'NONZERO_MEAN_WEIGHT', 'NoTagClip',
       'NoTagNum', 'NoTagPartMarkNum', 'Note', 'PROGRAM_CODE', 'PROJ_NAME',
       'PURPOSE_CODE', 'REARING_TYPE_CODE', 'RELEASE_COMMENT',
       'RELEASE_SITE_NAME', 'RELEASE_SITE_NAME_G', 'RELEASE_SITE_NAME_x',
       'RELEASE_SITE_NAME_y', 'RELEASE_STAGE_NAME', 'RELEASE_YEAR',
       'REL_CU_INDEX', 'REL_CU_NAME', 'RUN_NAME', 'Release_Site', 'RowNum',
       'SOURCE', 'SPECIES_NAME', 'START_DATE', 'START_DAY_REL', 'START_MO_REL',
       'START_YR_REL', 'STOCK_CU_INDEX', 'STOCK_CU_NAME', 'STOCK_NAME',
       'STOCK_PROD_AREA_CODE', 'STOCK_TYPE_CODE', 'ShedTa

In [386]:
temp3 = releases_ewerowscols[['EWE_GROUP_CODE','BIOMASS_MT','release_avg_date','FINAL_LAT','FINAL_LON',
                             'ROW_EWE','COL_EWE']]
temp3[temp3['EWE_GROUP_CODE']=="Chinook-H-LFR-2"]
temp3[temp3['EWE_GROUP_CODE']=="Chinook-H-LFR-2"].to_csv("temp3.csv")


## 6. Write to File <a class="anchor" id="section-6"></a>

[BACK TO TOP](#top)

In [None]:
# to do: consider setting up an SQLite database to manage the tables
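
As a starting point for the SQLite idea, the sketch below writes a small table with `DataFrame.to_sql` and reads an aggregate back with `pd.read_sql_query`, using Python's built-in `sqlite3` module. Everything in it is illustrative: the table name `releases`, the in-memory database and the demo values are assumptions, and only a few of the notebook's column names are used.

```python
import sqlite3
import pandas as pd

# made-up demo rows standing in for releases_ewerowscols
releases_demo = pd.DataFrame({
    "EWE_GROUP_CODE": ["Chinook-H-LFR-2", "Coho-H-UGS-2", "Chinook-H-LFR-2"],
    "BIOMASS_MT": [1.5, 0.4, 2.0],
    "ROW_EWE": [120, 45, 118],
    "COL_EWE": [35, 12, 33],
})

# ":memory:" keeps the example self-contained; a file path would persist the tables
con = sqlite3.connect(":memory:")
releases_demo.to_sql("releases", con, index=False, if_exists="replace")

totals = pd.read_sql_query(
    "SELECT EWE_GROUP_CODE, SUM(BIOMASS_MT) AS total_mt "
    "FROM releases GROUP BY EWE_GROUP_CODE ORDER BY EWE_GROUP_CODE",
    con,
)
con.close()
print(totals)
```

Keeping the release table and the coordinate lookup tables in one database file would let the joins done across notebooks be expressed as queries, though whether that is worth the setup is a judgment call.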