In [43]:
import pandas as pd

# read the csv in
fire_locations = pd.read_csv('data/Wildland_Fire_Incident_Locations.csv', index_col='OBJECTID', low_memory=False)

# display the csv
# display(fire_locations)

| Attribute | Description |
|---|---|
| SourceOID | The OBJECTID value of the source record in the source dataset providing the attribution. |
| ABCDMisc | A FireCode used by USDA FS to track and compile cost information for emergency IA fire suppression on A, B, C & D size class fires on FS lands. |
| ADSPermissionState | Indicates the permission hierarchy that is currently being applied when a system utilizes the UpdateIncident operation. |
| ContainmentDateTime | The date and time a wildfire was declared contained. |
| ControlDateTime | The date and time a wildfire was declared under control. |
| CreatedBySystem | ArcGIS Server Username of system that created the IRWIN Incident record. |
| IncidentSize | The size reported for a fire, in acres. The minimum size is 0.1. |
| DiscoveryAcres | An estimate of acres burning when the fire is first reported by the first person to call in the fire.  The estimate should include number of acres within the current perimeter of a specific, individual incident, including unburned and unburnable islands. |
| DispatchCenterID | A unique identifier for a dispatch center responsible for supporting the incident. |
| EstimatedCostToDate | The total estimated cost of the incident to date. |
| FinalAcres | Reported final acreage of incident. |
| FinalFireReportApprovedByTitle | The title of the person that approved the final fire report for the incident. |
| FinalFireReportApprovedByUnit | NWCG Unit ID associated with the individual who approved the final report for the incident. |
| FinalFireReportApprovedDate | The date that the final fire report was approved for the incident. |
| FireBehaviorGeneral | A general category describing how the fire is currently reacting to the influences of fuel, weather, and topography. |
| FireBehaviorGeneral1 | A more specific category further describing the general fire behavior (how the fire is currently reacting to the influences of fuel, weather, and topography). |
| FireBehaviorGeneral2 | A more specific category further describing the general fire behavior (how the fire is currently reacting to the influences of fuel, weather, and topography).  |
| FireBehaviorGeneral3 | A more specific category further describing the general fire behavior (how the fire is currently reacting to the influences of fuel, weather, and topography). |
| FireCause | Broad classification of the reason the fire occurred identified as human, natural or unknown.  |
| FireCauseGeneral | Agency or circumstance which started a fire or set the stage for its occurrence; source of a fire's ignition. For statistical purposes, fire causes are further broken into specific causes.  |
| FireCauseSpecific | A further categorization of each General Fire Cause to indicate more specifically the agency or circumstance which started a fire or set the stage for its occurrence; source of a fire's ignition.  |
| FireCode | A code used within the interagency wildland fire community to track and compile cost information for emergency fire suppression expenditures for the incident.  |
| FireDepartmentID | The U.S. Fire Administration (USFA) has created a national database of Fire Departments.  Most Fire Departments do not have an NWCG Unit ID and so it is the intent of the IRWIN team to create a new field that includes this data element to assist the National Association of State Foresters (NASF) with data collection. |
| FireDiscoveryDateTime | The date and time a fire was reported as discovered or confirmed to exist.  May also be the start date for reporting purposes. |
| FireMgmtComplexity | The highest management level utilized to manage a wildland fire event.  |
| FireOutDateTime | The date and time when a fire is declared out.  |
| FireStrategyConfinePercent | Indicates the percentage of the incident area where the fire suppression strategy of "Confine" is being implemented. |
| FireStrategyFullSuppPercent | Indicates the percentage of the incident area where the fire suppression strategy of "Full Suppression" is being implemented. |
| FireStrategyMonitorPercent | Indicates the percentage of the incident area where the fire suppression strategy of "Monitor" is being implemented. |
| FireStrategyPointZonePercent | Indicates the percentage of the incident area where the fire suppression strategy of "Point Zone Protection" is being implemented. |
| FSJobCode | Specific to the Forest Service, code used to indicate the FS job accounting code for the incident. Usually displayed as a 2 char prefix on FireCode. |
| FSOverrideCode | Specific to the Forest Service, code used to indicate the FS override code for the incident.  Usually displayed as a 4 char suffix on FireCode.  For example, if the FS is assisting DOI, an override of 1502 will be used. |
| GACC | A code that identifies the wildland fire geographic area coordination center (GACC) at the point of origin for the incident. A GACC is a facility used for the coordination of agency or jurisdictional resources in support of one or more incidents within a geographic area. |
| ICS209ReportDateTime | The date and time of the latest approved ICS-209 report. |
| ICS209ReportForTimePeriodFrom | The date and time of the beginning of the time period for the current ICS-209 submission. |
| ICS209ReportForTimePeriodTo | The date and time of the end of the time period for the current ICS-209 submission.   |
| ICS209ReportStatus | The version of the ICS-209 report (initial, update, or final). There should never be more than one initial report, but there can be numerous updates and multiple finals (as determined by business rules). |
| IncidentManagementOrganization | The incident management organization for the incident, which may be a Type 1, 2, or 3 Incident Management Team (IMT), a Unified Command, a Unified Command with an IMT, National Incident Management Organization (NIMO), etc.  This field is null if no team is assigned. |
| IncidentName | The name assigned to an incident. |
| IncidentShortDescription | General descriptive location of the incident such as the number of miles from an identifiable town.  |
| IncidentTypeCategory | The Event Category is a sub-group of the Event Kind code and description. The Event Category breaks down the Event Kind into more specific event categories. |
| IncidentTypeKind | A general, high-level code and description of the types of incidents and planned events to which the interagency wildland fire community responds. |
| InitialLatitude | The latitude of the initial reported point of origin specified in decimal degrees. |
| InitialLongitude | The longitude of the initial reported point of origin specified in decimal degrees. |
| InitialResponseAcres | An estimate of acres burning at the time of initial response (when the IC arrives and performs initial size up). The minimum size must be 0.1. The estimate should include number of acres within the current perimeter of a specific, individual incident, including unburned and unburnable islands. |
| InitialResponseDateTime | The date/time of the initial response to the incident (when the IC arrives and performs initial size up). |
| IrwinID | Unique identifier assigned to each incident record in IRWIN. |
| IsFireCauseInvestigated | Indicates if an investigation is underway or was completed to determine the cause of a fire. |
| IsFSAssisted | Indicates if the Forest Service provided assistance on an incident outside their jurisdiction. |
| IsMultiJurisdictional | Indicates if the incident covers multiple jurisdictions. |
| IsReimbursable | Indicates the cost of an incident may be another agency’s responsibility. |
| IsTrespass | Indicates if the incident is a trespass claim or if a bill will be pursued. |
| IsUnifiedCommand | Indicates whether the incident is being managed under Unified Command.  Unified Command is an application of the ICS used when there is more than one agency with incident jurisdiction or when incidents cross political jurisdictions. Under Unified Command, agencies work together through their designated IC at a single incident command post to establish common objectives and issue a single Incident Action Plan. |
| LocalIncidentIdentifier | A number or code that uniquely identifies an incident for a particular local fire management organization within a particular calendar year. |
| ModifiedBySystem | ArcGIS Server username of system that last modified the IRWIN Incident record. |
| PercentContained | Indicates the percent of incident area that is no longer active. Reference definition in fire line handbook when developing standard. |
| PercentPerimeterToBeContained | Indicates the percent of perimeter left to be completed. This entry is appropriate for full suppression, point/zone protection, and confine fires, or any combination of these strategies. This entry is not used for wildfires managed entirely under a monitor strategy.  (Note: Value is not currently being passed by ICS-209) |
| POOCity | The closest city to the incident point of origin. |
| POOCounty | The County Name identifying the county or equivalent entity at point of origin designated at the time of collection. |
| POODispatchCenterID | A unique identifier for the dispatch center that intersects with the incident point of origin. |
| POOFips | The code which uniquely identifies counties and county equivalents.  The first two digits are the FIPS State code and the last three are the county code within the state. |
| POOJurisdictionalAgency | The agency having land and resource management responsibility for an incident as provided by federal, state or local law. |
| POOJurisdictionalUnit | NWCG Unit Identifier to identify the unit with jurisdiction for the land where the point of origin falls. |
| POOJurisdictionalUnitParentUnit | The unit ID for the parent entity, such as a BLM State Office or USFS Regional Office, that resides over the Jurisdictional Unit. |
| POOLandownerCategory | More specific classification of land ownership within land owner kinds identifying the deeded owner at the point of origin at the time of the incident. |
| POOLandownerKind | Broad classification of land ownership identifying the deeded owner at the point of origin at the time of the incident. |
| POOLegalDescPrincipalMeridian | The principal meridian of the legal description (section, township, range) of the incident at point of origin. |
| POOLegalDescQtr | The quarter section of the legal description (section, township, range) of the incident at point of origin. |
| POOLegalDescQtrQtr | The quarter/quarter section of the legal description (section, township, range) of the incident at point of origin. |
| POOLegalDescRange | The range of the legal description (section, township, range) of the incident at point of origin. |
| POOLegalDescSection | The section of the legal description (section, township, range) of the incident at point of origin. |
| POOLegalDescTownship | The township of the legal description (section, township, range) of the incident at point of origin. |
| POOPredictiveServiceAreaID | The predictive service area ID where the incident's point of origin is located.  Predictive Service Areas (PSAs) are geographic areas of similar climate based on statistical correlation of Remote Automated Weather Stations (RAWS) data. |
| POOProtectingAgency | Indicates the agency that has protection responsibility at the point of origin. |
| POOProtectingUnit | NWCG Unit responsible for providing direct incident management and services to an incident pursuant to its jurisdictional responsibility or as specified by law, contract or agreement. Definition extension: protection can be re-assigned by agreement; the nature and extent of the incident determines protection (for example, Wildfire vs. All Hazard). |
| POOState | The State alpha code identifying the state or equivalent entity at point of origin. |
| PredominantFuelGroup | The majority fuel model type that best represents fire behavior in the incident area, grouped into one of seven categories. |
| PredominantFuelModel | Describes the types of fuel found within the incident area.   |
| PrimaryFuelModel | The fuel model that best represents the primary carrier of the fire for the reporting period. |
| SecondaryFuelModel | The fuel model which best represents the secondary carrier of the fire for the reporting period. |
| TotalIncidentPersonnel | The total number of personnel assigned. Includes overhead, crewmembers, helicopter crewmember, engine crewmembers, camp crew people, etc. |
| UniqueFireIdentifier | Unique identifier assigned to each wildland fire, in the form yyyy-SSUUUU-xxxxxx: yyyy = calendar year, SSUUUU = POO protecting unit identifier (5 or 6 characters), xxxxxx = local incident identifier (6 to 10 characters). |
| WFDSSDecisionStatus | Indicates the state of the WFDSS decision and/or if a WFDSS decision has been approved for the incident. This information is helpful in resolving conflicts between incident records. |
| OrganizationalAssessment | The Organizational Assessment is part of the Wildland Fire Risk and Complexity Assessment (RCA) that was implemented by NWCG in January 2014, which guides Agency Administrators in their management organization selection, both in escalating and moderating situations.  The Organizational Assessment can be compared with the current Incident Management Organization and many other incident level data elements over the life of the fire. It may not always match the current Incident Management Organization value. The authority for producing the Organizational Assessment lies with the incident commander (NWCG PMS-210), thus, the Organizational Assessment can change independent of the published decision.  |
| StrategicDecisionPublishDate | The Decision Publish Date represents the date Agency Administrators published (approved) the Strategic Decision document. New decisions can be created and published at any time until the incident has been called out. |
| CreatedOnDateTime_dt | Date/time that the IRWIN Incident record was created. |
| ModifiedOnDateTime_dt | Date/time that the IRWIN Incident record was last modified. |
| IsCpxChild | Indicates whether the incident is part of a Complex or not. "0" for no, "1" for yes. |
| CpxName | The Incident Name of the Complex that is the parent of the incident. |
| CpxID | The IRWIN ID for the Complex record that is the parent of the incident. |
| SourceGlobalID | The GlobalID value of the source record in the source dataset. |
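
The `UniqueFireIdentifier` layout described in the table can be split into its parts with a small helper. The sketch below is illustrative only: the name `parse_unique_fire_identifier`, the hyphen separator (inferred from values such as `2014-AKFAS-411093` in this dataset), and the assumption that the unit identifier is alphanumeric are not part of the data dictionary.

```python
import re
from typing import NamedTuple, Optional


class FireIdentifier(NamedTuple):
    year: int
    protecting_unit: str
    local_incident_id: str


# yyyy-SSUUUU-xxxxxx: 4-digit year, 5-6 character unit ID, 6-10 character local ID.
# Assumption: the parts are hyphen-separated and the unit ID is alphanumeric.
_UFI_PATTERN = re.compile(r"(\d{4})-([A-Za-z0-9]{5,6})-(.{6,10})")


def parse_unique_fire_identifier(value: str) -> Optional[FireIdentifier]:
    """Return the parsed parts, or None if the value does not fit the layout."""
    match = _UFI_PATTERN.fullmatch(value)
    if match is None:
        return None
    year, unit, local_id = match.groups()
    return FireIdentifier(int(year), unit, local_id)


print(parse_unique_fire_identifier("2014-AKFAS-411093"))
```

Returning `None` for values that do not fit keeps malformed identifiers visible to the caller instead of raising partway through a column.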

In [44]:
# Work on a copy so fire_locations keeps the raw CSV contents
sortByDate = fire_locations.copy()

# Parse the record-creation timestamp and sort by it
sortByDate['CreatedOnDateTime_dt'] = pd.to_datetime(sortByDate['CreatedOnDateTime_dt'])
df_sorted = sortByDate.sort_values(by='CreatedOnDateTime_dt')
# X and Y hold the point coordinates: X is longitude, Y is latitude
df_sorted = df_sorted.rename(columns={'X': 'Longitude', 'Y': 'Latitude'})

# display(df_sorted)

In [45]:
# modified to remove some columns
df_sorted.drop(['SourceOID', 'ABCDMisc', 'ADSPermissionState', 'CreatedBySystem', 'StrategicDecisionPublishDate', 'IsCpxChild', 'CpxName', 'CpxID', 'SourceGlobalID', 'GlobalID', 'WFDSSDecisionStatus', 'EstimatedFinalCost', 'TotalIncidentPersonnel', 'PredominantFuelGroup', 'PredominantFuelModel', 'PrimaryFuelModel', 'SecondaryFuelModel', 'POOLegalDescSection', 'POOLegalDescTownship', 'POOPredictiveServiceAreaID', 'POOProtectingAgency', 'FinalFireReportApprovedByTitle', 'POOLandownerKind', 'POOLegalDescPrincipalMeridian', 'POOLegalDescQtr', 'POOLegalDescQtrQtr', 'POOLegalDescRange', 'POOFips', 'POOJurisdictionalAgency', 'POOJurisdictionalUnitParentUnit'], axis = 1, inplace = True)
# we might want 'EstimatedFinalCost' and 'TotalIncidentPersonnel'

# We still might want these, still looking for more
# df_sorted.drop(['POOCity', 'POOCounty', 'POODispatchCenterID', 'POOJurisdictionalUnit', 'PercentContained', 'PercentPerimeterToBeContained', 'POOLandownerCategory', 'POOProtectingUnit', 'IncidentTypeCategory', 'IncidentTypeKind', 'InitialLatitude', 'InitialLongitude', 'FireCause', 'FireCauseSpecific', 'IncidentShortDescription'], axis = 1, inplace = True)

# more to remove
df_sorted.drop(['FinalFireReportApprovedByUnit', 'IsUnifiedCommand', 'LocalIncidentIdentifier', 'OrganizationalAssessment', 'IsMultiJurisdictional', 'IsReimbursable', 'IsTrespass', 'FinalFireReportApprovedDate', 'IsFireCauseInvestigated', 'IsFireCodeRequested', 'IsFSAssisted'], axis=1, inplace=True)

# even more to remove
df_sorted.drop(['FireBehaviorGeneral', 'IrwinID', 'FireBehaviorGeneral1', 'FireBehaviorGeneral2', 'FireBehaviorGeneral3', 'ICS209ReportForTimePeriodTo', 'ICS209ReportStatus', 'IncidentManagementOrganization', 'ICS209ReportDateTime', 'ICS209ReportForTimePeriodFrom', 'FireStrategyPointZonePercent', 'FSJobCode', 'FSOverrideCode', 'FireCauseGeneral', 'FireStrategyConfinePercent', 'FireStrategyFullSuppPercent', 'FireStrategyMonitorPercent'], axis=1, inplace=True)

# display(df_sorted)

In [46]:
# Get all datatypes of columns and print them
types = df_sorted.dtypes
print(types)

Longitude                                    float64
Latitude                                     float64
ContainmentDateTime                           object
ControlDateTime                               object
IncidentSize                                 float64
DiscoveryAcres                               float64
DispatchCenterID                              object
EstimatedCostToDate                          float64
FinalAcres                                   float64
FireCause                                     object
FireCauseSpecific                             object
FireCode                                      object
FireDepartmentID                              object
FireDiscoveryDateTime                         object
FireMgmtComplexity                            object
FireOutDateTime                               object
GACC                                          object
IncidentName                                  object
IncidentShortDescription                      
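
The listing above shows the date columns (for example `ContainmentDateTime` and `FireDiscoveryDateTime`) loaded as `object`, that is, plain strings. A minimal sketch of converting one such column to timezone-aware timestamps follows. It assumes every value uses the `YYYY/MM/DD HH:MM:SS+00` form seen in this dataset's output; `parse_utc_column` is a hypothetical helper name, and values in any other form become `NaT`.

```python
import pandas as pd


def parse_utc_column(series: pd.Series) -> pd.Series:
    """Convert strings like '2014/05/12 03:06:19+00' to UTC timestamps."""
    # Assumption: the offset is always the literal '+00' suffix, so it is
    # stripped and the result is localized to UTC explicitly.
    without_offset = series.str.replace(r"\+00$", "", regex=True)
    parsed = pd.to_datetime(without_offset, format="%Y/%m/%d %H:%M:%S", errors="coerce")
    return parsed.dt.tz_localize("UTC")


sample = pd.Series(["2014/05/12 03:06:19+00", None], name="FireDiscoveryDateTime")
print(parse_utc_column(sample))
```

Converting a column this way changes how its values are written out by `to_csv`, so it may be preferable to parse into a separate column if the front end expects the original strings.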

In [47]:
# Bounding box in decimal degrees: latitude bounds are south/north,
# longitude bounds are west/east
SOUTH_BOUNDARY = 45.817
NORTH_BOUNDARY = 65.817
WEST_BOUNDARY = -111.929
EAST_BOUNDARY = -98.437

# Keep fires discovered before 2022. FireDiscoveryDateTime is still a string
# column at this point, so this is a lexicographic comparison.
filtered_df = df_sorted.loc[(df_sorted['FireDiscoveryDateTime'] < '2022-01-01')]

# filtered_df = filtered_df.loc[(filtered_df['Latitude'] > SOUTH_BOUNDARY)
#                               & (filtered_df['Latitude'] < NORTH_BOUNDARY)]

# filtered_df = filtered_df.loc[(filtered_df['Longitude'] < EAST_BOUNDARY)
#                               & (filtered_df['Longitude'] > WEST_BOUNDARY)]

# display the filtered data
# display(filtered_df)
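
The commented-out latitude and longitude filters above amount to a bounding-box test on the point coordinates. One way to write that as a reusable helper is sketched below; `filter_bounding_box` and its parameter names are hypothetical, and the sample rows are made up for illustration. It uses strict inequalities, matching the commented-out code.

```python
import pandas as pd


def filter_bounding_box(df, min_lat, max_lat, min_lon, max_lon):
    """Keep rows whose Latitude/Longitude fall strictly inside the box."""
    inside = (
        (df["Latitude"] > min_lat)
        & (df["Latitude"] < max_lat)
        & (df["Longitude"] > min_lon)
        & (df["Longitude"] < max_lon)
    )
    return df.loc[inside]


# Made-up sample points; the bounds are the values defined in this cell.
points = pd.DataFrame(
    {
        "IncidentName": ["inside", "too far south", "too far west"],
        "Latitude": [47.7, 44.0, 64.8],
        "Longitude": [-102.6, -99.4, -147.7],
    }
)
boxed = filter_bounding_box(points, 45.817, 65.817, -111.929, -98.437)
print(boxed["IncidentName"].tolist())
```

Naming the parameters by minimum and maximum latitude and longitude makes it harder to mix up which bound applies to which axis.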

In [48]:
# Create a subset collection for initial front-end use and plotting
# We want to use the following data:
frontend_data = filtered_df[['IncidentName', 'UniqueFireIdentifier', 'FireDiscoveryDateTime', 'InitialResponseAcres', 'CreatedOnDateTime_dt', 'GACC', 'ContainmentDateTime', 'ControlDateTime', 'FireOutDateTime', 'DiscoveryAcres', 'FinalAcres', 'IncidentSize', 'InitialLatitude', 'InitialLongitude', 'Latitude', 'Longitude']]

# display(frontend_data)

# flag every row whose UniqueFireIdentifier appears more than once
duplicates_condition = frontend_data.duplicated(subset=['UniqueFireIdentifier'], keep=False)
# copy so that adding a helper column does not trigger SettingWithCopyWarning
duplicate_data = frontend_data[duplicates_condition].copy()

# for each UniqueFireIdentifier, keep the duplicate row with the fewest missing values
duplicate_data['nan_counts'] = duplicate_data.isna().sum(axis=1)
idx_to_keep = duplicate_data.groupby(['UniqueFireIdentifier'])['nan_counts'].idxmin()
result = pd.concat([frontend_data[~duplicates_condition], duplicate_data.loc[idx_to_keep]])
result = result.drop(columns='nan_counts')

# display(result)


In [49]:
# drop rows with any missing value from the de-duplicated data
fire_without_nan = result.dropna()

# write both versions to file; the first keeps NaN (empty) values so the
# front end can show them as Unknown when a fire is clicked
result.to_csv('output/frontend_fire_data.csv', index=False)
fire_without_nan.to_csv('output/frontend_fire_data_no_nan.csv', index=False)

# Show difference in data
print('Table with NaN:')
display(result)
# print('Table without NaN:')
# display(fire_without_nan)

Table with NaN:


OBJECTID,IncidentName,UniqueFireIdentifier,FireDiscoveryDateTime,InitialResponseAcres,CreatedOnDateTime_dt,GACC,ContainmentDateTime,ControlDateTime,FireOutDateTime,DiscoveryAcres,FinalAcres,IncidentSize,InitialLatitude,InitialLongitude,Latitude,Longitude
210209,Grand Pager,2014-AKFAS-411093,2014/05/12 03:06:19+00,0.1,2014-05-21 04:09:45+00:00,AKCC,2014/05/12 05:17:21+00,2014/05/12 05:17:35+00,2014/05/13 01:53:46+00,0.1,0.1,0.1,64.802650,-147.744683,64.802702,-147.745026
207222,Glenn Alps,2014-AKMSS-401077,2014/05/09 03:22:55+00,0.1,2014-05-21 04:10:05+00:00,AKCC,2014/05/09 07:09:00+00,2014/05/09 07:09:31+00,2014/05/10 20:27:38+00,0.1,0.7,0.7,61.102967,-149.662833,61.103002,-149.663023
249529,Johnson Lake,2014-AKKKS-403067,2014/05/05 00:18:25+00,0.1,2014-05-21 04:10:06+00:00,AKCC,2014/05/05 00:28:35+00,2014/05/05 00:28:00+00,2014/05/05 00:33:26+00,0.1,0.1,0.1,60.295033,-151.262467,60.295001,-151.262022
137322,Mile 9 Talkeetna Spur,2014-AKMSS-401137,2014/05/19 21:45:02+00,0.1,2014-05-21 04:10:07+00:00,AKCC,2014/05/19 22:02:00+00,2014/05/19 22:02:31+00,2014/05/25 20:02:25+00,0.1,0.1,0.1,62.422633,-150.081217,62.422602,-150.081024
85378,Jim Lake,2014-AKMSS-401131,2014/05/19 00:01:00+00,0.1,2014-05-21 04:10:07+00:00,AKCC,2014/05/19 16:14:45+00,2014/05/19 16:14:49+00,2014/05/29 23:42:41+00,0.1,1.0,1.0,61.543767,-148.879433,61.543802,-148.879023
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4024,PATRIOT,2021-NDFBA-000067,2021/07/01 19:49:00+00,,2021-07-09 14:47:24+00:00,NRCC,,,,,,,,,47.733341,-102.674177
14739,MOE,2021-NDFBA-000110,2021/11/03 13:10:00+00,,2021-11-03 13:11:14+00:00,NRCC,,,,,,,,,48.993897,-102.529455
44282,UP4DAYZ,2021-NDFTA-000153,2021/04/07 03:30:00+00,,2021-04-07 23:57:02+00:00,NRCC,,,,,,,,,47.919730,-98.904732
70959,PETRO,2021-SDCCA-000086,2021/09/26 00:15:00+00,,2021-09-27 15:07:46+00:00,RMCC,,,,,,,,,44.056396,-99.441120
