All individuals survived unless noted otherwise.
Entries on the spreadsheet are color-coded.

• Unprovoked Incidents = Tan
• Provoked Incidents = Orange
• Incidents Involving Watercraft = Green
• Air / Sea Disasters = Yellow
• Questionable Incidents = Blue

Unprovoked vs. Provoked - GSAF defines a provoked incident as one in which the shark was speared, hooked or captured, or in which a human drew "first blood". Such incidents are of little interest to shark behaviorists. However, when the species of shark involved is known and pre-op photos of the wounds are available, the bite patterns help determine the species involved in other cases where it could not be identified by the patient or witnesses. We know that a live human is rarely perceived as prey by a shark. Many incidents are motivated by curiosity; others may result when a shark perceives a human as a threat or as a competitor for a food source, and could be classed as "provoked" when examined from the shark's perspective.

Incidents involving watercraft - Incidents in which a boat was bitten or rammed by a shark are in green. However, when the shark was hooked, netted or gaffed, the entry is orange because it is classed as a provoked incident.

Questionable incidents - Incidents in which there are insufficient data to determine if the injury was caused by a shark or the person drowned and the body was later scavenged by sharks. In a few cases, despite media reports to the contrary, evidence indicated there was no shark involvement whatsoever. Such incidents are in blue.
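
The color legend above can be captured as a small lookup table, which may be handy when recoloring plots or checking the spreadsheet highlighting. This is only a minimal sketch: the dictionary keys are assumed labels for the `Type` column (only "Unprovoked" appears in the rows shown in this notebook), and `incident_color` is a hypothetical helper, not part of GSAF or pandas.

```python
# Color legend from the spreadsheet, keyed by ASSUMED incident-type labels.
GSAF_COLORS = {
    "Unprovoked": "tan",
    "Provoked": "orange",
    "Watercraft": "green",
    "Sea Disaster": "yellow",
    "Questionable": "blue",
}


def incident_color(incident_type, default="white"):
    """Return the legend color for an incident type (hypothetical helper)."""
    # Strip stray whitespace so " Watercraft " still matches its key.
    return GSAF_COLORS.get(str(incident_type).strip(), default)


print(incident_color("Provoked"))
print(incident_color(""))  # unlabeled rows fall back to the default
```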

https://www.sharkattackfile.net/incidentlog.htm

In [40]:
import pandas as pd
import matplotlib.pyplot as plt

In [41]:
# na_filter=False keeps empty cells as "" instead of NaN; it also means na_values is not applied.
shark_df = df = pd.read_excel("shark_attack_data.xls", na_filter=False, usecols="B:O", nrows=6774, na_values="UNKNOWN")
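
The interaction between `na_filter` and `na_values` is easy to miss, so here is a small sketch of it. It uses `pd.read_csv` on an in-memory string as a stand-in for `pd.read_excel` (both accept these two parameters), and the three-row table is made up for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for the spreadsheet: one "UNKNOWN" cell and one empty cell.
csv_text = "Year,Fatal\n2022,N\n2021,UNKNOWN\n2020,\n"

# na_filter=False: nothing is converted to NaN, so na_values has no effect.
raw = pd.read_csv(io.StringIO(csv_text), na_filter=False, na_values=["UNKNOWN"])

# Default na_filter=True: the empty cell and "UNKNOWN" both become NaN.
parsed = pd.read_csv(io.StringIO(csv_text), na_values=["UNKNOWN"])

print(raw["Fatal"].tolist())
print(parsed["Fatal"].isna().sum())
```

Keeping `na_filter=False` explains why `info()` reports every column as fully non-null: empty cells are stored as empty strings rather than as missing values.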

In [42]:
shark_df.head(10)

Unnamed: 0,Date,Year,Type,Country,Area,Location,Activity,Name,Sex,Age,Injury,Fatal (Y/N),Time,Species
0,03-Jul-2022,2022,Unprovoked,USA,New York,"Smith Point Beach, Suffolk County",Lifeguard Exercises,Zach Gallo,M,,Injuries to chest and right hand,N,10h15,5'shark
1,01-Jul-2022,2022,Unprovoked,EGYPT,"Hurghada, Red Sea Governorate",Sahl Hasheesh,Swimming,Romanian female,F,40s,FATAL,Y,,
2,01-Jul-2022,2022,Unprovoked,EGYPT,"Hurghada, Red Sea Governorate",Sahl Hasheesh,Swimming,Elisabeth Sauer,F,68,FATAL,Y,,2m shark
3,30-Jun-2022,2022,Unprovoked,USA,New York,"Jones Beach, Nassau County",Swimming,male,M,,Injury to right foot,N,13h00,Shark involvement not confirmed
4,30-Jun-2022,2022,Unprovoked,USA,Florida,"Keaton Beach,",Scalloping,Addison Bethea,F,17,Severe bite to right leg,N,15h00,9' shark
5,29-Jun-2022,2022,Unprovoked,USA,Florida,"Sawyer Key, Monroe County",Swimming,Lindsay Rebecca Bruns,F,30s,Laceration to leg,N,20h00,
6,29-Jun-2022,2022,Unprovoked,USA,Florida,"Summerland Key, Monroe County",Jumped into water,male,M,,Laceration to leg,N,Afternoon,
7,28-Jun-2022,2022,Unprovoked,SOUTH AFRICA,Western Cape Province,"Sanctuary Beach, Plettenberg Bay",Swimming,Bruce Wolov,M,,FATAL,Y,14h09,While shark
8,24-Jun-2022,2022,,JA MAICA,,,Fishing,Michael Simpson,M,,Right arm severed,N,Afternoon,"Tiger shark, 15'"
9,23-Jun-2022,2022,Unprovoked,USA,Florida,"Redington Beach, Pinellas County",,female,F,,Leg bitten,N,,


In [43]:
shark_df.tail()

Unnamed: 0,Date,Year,Type,Country,Area,Location,Activity,Name,Sex,Age,Injury,Fatal (Y/N),Time,Species
6769,Before 1903,0,Unprovoked,AUSTRALIA,Western Australia,Roebuck Bay,Diving,male,M,,FATAL,Y,,
6770,Before 1903,0,Unprovoked,AUSTRALIA,Western Australia,,Pearl diving,Ahmun,M,,FATAL,Y,,
6771,1900-1905,0,Unprovoked,USA,North Carolina,Ocracoke Inlet,Swimming,Coast Guard personnel,M,,FATAL,Y,,
6772,1883-1889,0,Unprovoked,PANAMA,,"Panama Bay 8ºN, 79ºW",,Jules Patterson,M,,FATAL,Y,,
6773,1845-1853,0,Unprovoked,CEYLON (SRI LANKA),Eastern Province,"Below the English fort, Trincomalee",Swimming,male,M,15.0,"FATAL. ""Shark bit him in half, carrying away t...",Y,,


In [44]:
shark_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6774 entries, 0 to 6773
Data columns (total 14 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Date         6774 non-null   object
 1   Year         6774 non-null   object
 2   Type         6774 non-null   object
 3   Country      6774 non-null   object
 4   Area         6774 non-null   object
 5   Location     6774 non-null   object
 6   Activity     6774 non-null   object
 7   Name         6774 non-null   object
 8   Sex          6774 non-null   object
 9   Age          6774 non-null   object
 10  Injury       6774 non-null   object
 11  Fatal (Y/N)  6774 non-null   object
 12  Time         6774 non-null   object
 13  Species      6774 non-null   object
dtypes: object(14)
memory usage: 741.0+ KB


In [45]:
shark_df.keys()

Index(['Date', 'Year', 'Type', 'Country', 'Area', 'Location', 'Activity',
       'Name', 'Sex ', 'Age', 'Injury', 'Fatal (Y/N)', 'Time', 'Species '],
      dtype='object')
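
Two of the labels above, `'Sex '` and `'Species '`, carry a trailing space, so `shark_df["Sex"]` would raise a `KeyError`. One possible cleanup, sketched on a toy frame with made-up rows, is to strip whitespace from every column label:

```python
import pandas as pd

# Toy frame reproducing the trailing-space column labels.
toy = pd.DataFrame({"Sex ": ["M", "F"], "Species ": ["", "2m shark"]})

# Strip surrounding whitespace from every column label.
toy.columns = toy.columns.str.strip()

print(list(toy.columns))
```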

In [46]:
shark_df.rename(columns = {"Fatal (Y/N)" : "Fatal"}, inplace = True )

In [47]:
# drop() returns a new DataFrame; shark_df itself keeps "Name" unless the result is assigned.
shark_df.drop(columns = "Name")

Unnamed: 0,Date,Year,Type,Country,Area,Location,Activity,Sex,Age,Injury,Fatal,Time,Species
0,03-Jul-2022,2022,Unprovoked,USA,New York,"Smith Point Beach, Suffolk County",Lifeguard Exercises,M,,Injuries to chest and right hand,N,10h15,5'shark
1,01-Jul-2022,2022,Unprovoked,EGYPT,"Hurghada, Red Sea Governorate",Sahl Hasheesh,Swimming,F,40s,FATAL,Y,,
2,01-Jul-2022,2022,Unprovoked,EGYPT,"Hurghada, Red Sea Governorate",Sahl Hasheesh,Swimming,F,68,FATAL,Y,,2m shark
3,30-Jun-2022,2022,Unprovoked,USA,New York,"Jones Beach, Nassau County",Swimming,M,,Injury to right foot,N,13h00,Shark involvement not confirmed
4,30-Jun-2022,2022,Unprovoked,USA,Florida,"Keaton Beach,",Scalloping,F,17,Severe bite to right leg,N,15h00,9' shark
...,...,...,...,...,...,...,...,...,...,...,...,...,...
6769,Before 1903,0000,Unprovoked,AUSTRALIA,Western Australia,Roebuck Bay,Diving,M,,FATAL,Y,,
6770,Before 1903,0000,Unprovoked,AUSTRALIA,Western Australia,,Pearl diving,M,,FATAL,Y,,
6771,1900-1905,0000,Unprovoked,USA,North Carolina,Ocracoke Inlet,Swimming,M,,FATAL,Y,,
6772,1883-1889,0000,Unprovoked,PANAMA,,"Panama Bay 8ºN, 79ºW",,M,,FATAL,Y,,


In [48]:
# to_numeric() returns a new Series; assign it back to shark_df["Year"] to keep the conversion.
pd.to_numeric(shark_df["Year"])

0       2022.0
1       2022.0
2       2022.0
3       2022.0
4       2022.0
         ...  
6769       0.0
6770       0.0
6771       0.0
6772       0.0
6773       0.0
Name: Year, Length: 6774, dtype: float64
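
The result is `float64` and the oldest rows come out as `0.0`. A possible follow-up, sketched on made-up values, is to coerce unparseable entries, treat `0` as missing, and store the result as a nullable integer. Reading `0` as a placeholder for "year unknown" is an assumption based on rows such as "Before 1903", not something the source states.

```python
import pandas as pd

toy = pd.DataFrame({"Year": ["2022", 2021, "0000", ""]})

# errors="coerce" yields NaN for entries that cannot be parsed instead of raising.
year = pd.to_numeric(toy["Year"], errors="coerce")

# ASSUMPTION: 0 is a placeholder for an unknown year, so treat it as missing.
year = year.mask(year == 0)

# The nullable Int64 dtype keeps whole-number years alongside missing values.
toy["Year"] = year.astype("Int64")

print(toy["Year"].tolist())
```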

In [49]:
shark_df["Fatal"].value_counts()

N          4695
Y          1435
            556
UNKNOWN      71
 N            7
F             2
M             2
2017          1
y             1
Y x 2         1
Nq            1
N             1
n             1
Name: Fatal, dtype: int64
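
Several of these labels look like variants of `N` and `Y` (leading spaces, lower case). A minimal normalization sketch follows, using a hand-typed sample of the labels above. The trailing-space `"N "` is an assumption inferred from `N` being listed twice, and mapping everything else (including `"Nq"` and `"Y x 2"`) to missing is a conservative choice rather than the only reasonable one.

```python
import pandas as pd

# Hand-typed sample of the raw labels; "N " (trailing space) is assumed.
fatal = pd.Series(
    ["N", "Y", "", "UNKNOWN", " N", "N ", "n", "y", "Nq", "M", "2017", "Y x 2"]
)

# Trim whitespace and upper-case, then keep only exact "Y" / "N" matches.
cleaned = fatal.str.strip().str.upper()
cleaned = cleaned.where(cleaned.isin(["Y", "N"]))  # everything else becomes NaN

print(cleaned.value_counts(dropna=False))
```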

In [50]:
shark_df["Fatal"].loc[((shark_df["Fatal"] != "N") & (shark_df["Fatal"] != "Y"))]

27            M
28            n
42             
59             
64             
         ...   
6637    UNKNOWN
6638    UNKNOWN
6653           
6704           
6749    UNKNOWN
Name: Fatal, Length: 644, dtype: object
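
The same filter can be written more compactly with `isin`, which scales better as the list of accepted labels grows. A sketch on a toy frame with made-up rows:

```python
import pandas as pd

toy = pd.DataFrame({"Fatal": ["N", "M", "n", "", "Y", "UNKNOWN"]})

# Rows whose label is neither "N" nor "Y", selecting rows and column in one .loc call.
odd = toy.loc[~toy["Fatal"].isin(["N", "Y"]), "Fatal"]

print(odd.tolist())
```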

In [51]:
# Conditions must be joined by an explicit operator; & is assumed here, matching the previous cell.
shark_df["Fatal"].loc[((shark_df["Fatal"] == "n") & (shark_df["Fatal"] != "Nq") & (shark_df["Fatal"] != "N"))]
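
Boolean masks in pandas have to be joined with an explicit element-wise operator: `&` for "and", `|` for "or", `~` for "not". Two parenthesised comparisons written side by side with no operator between them are parsed by Python as a call on the first Series, which produces a `'Series' object is not callable` error. A short sketch on made-up labels:

```python
import pandas as pd

s = pd.Series(["N", "n", "Nq", "Y", " N"])

# Each comparison is parenthesised and joined by | (or) or & (and).
lowercase_or_typo = s[(s == "n") | (s == "Nq")]
not_clean = s[(s != "N") & (s != "Y")]

print(lowercase_or_typo.tolist())
print(not_clean.tolist())
```

The parentheses matter because `&` and `|` bind more tightly than `==` and `!=` in Python.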