# Cyclones Data Retrive from Wikipedia

This notebook we will explore more about the Cyclones, this dataset was retrive from Wikipedia, you can see the full dataset [here](https://docs.google.com/spreadsheets/d/1klGelicpEwqg7dmh8dqGs5izDwF5prhyxzMgAwdpbKg/edit#gid=246943789).


In [130]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sns

In [0]:
df = pd.read_csv('/content/drive/My Drive/From_wikipedia/cyclones.csv')

In [133]:
df.head()

Unnamed: 0,Cyclone name,Formed,Dissipated,Highest winds,Lowest pressure,Fatalities,Damage,Areas affected
0,Cyclone_Idai,4 March 2019,21 March 2019,10-minute sustained: 195 km/h (120 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,"≥1,303 total[nb 1][nb 2](Deadliest tropical cy...",≥ $2.2 billion (2019 USD)(Costliest tropical c...,"northern and central Mozambique, Malawi, north..."
1,Cyclone_Gafilo,1 March 2004,18 March 2004,10-minute sustained: 230 km/h (145 mph) 1-minu...,895 hPa (mbar); 26.43 inHg(Record Low in South...,"363 dead, 181 missing",$250 million (2004 USD),Madagascar
2,Cyclone_Nargis,27 April 2008,3 May 2008,3-minute sustained: 165 km/h (105 mph) 1-minut...,962 hPa (mbar); 28.41 inHg,"≥138,373 total(Sixth-deadliest tropical cyclon...",$12.9 billion (2008 USD)(Costliest cyclone rec...,"Bangladesh, Myanmar, India, Sri Lanka, Thailan..."
3,Cyclone_Bola,"February 23, 1988","March 4, 1988",10-minute sustained: 165 km/h (105 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,3 direct,$82 million (1988 USD),"Fiji, Vanuatu, New Zealand"
4,Cyclone_Sidr,"November 11, 2007","November 16, 2007",3-minute sustained: 215 km/h (130 mph) 1-minut...,944 hPa (mbar); 27.88 inHg,"3,447–15,000 total",$1.7 billion (2007 USD),"Andaman Islands, Bangladesh, West Bengal, Nort..."


In [134]:
df.dtypes

Cyclone name       object
Formed             object
Dissipated         object
Highest winds      object
Lowest pressure    object
Fatalities         object
Damage             object
Areas affected     object
dtype: object

We notice that all columns are objects, most specific they are Strings, also we notice some factors:

1. Formed and Dissipated Columns are in String format, we need to change this to TimeStamps.

2. Highest winds had two or three measures of winds, we will separated them so we can get more insights about the winds.

3. Lowest Pressure, have two indicators, first is in Pascal and the second is on Mercury, we only use pascal.

4. Fatalities, some rows have dead, missing other coluns estimations of how many people died at, we will create two columns from fatalities, first column of how many people died and the second column how many people is missing.

5. Damage column express estimates how much the respective cyclone coust to their regions, if we look at they have different years, like 2004, 2008 etc. So let's convert these values to US$ dolars including the inflation rate between these years.

6. Areas affected, let's create a column for every region where was affected:
  * 1 if the region is in areas affected column
  * 0 if the region is not in the areas affected.


### Data Cleaning

## Formed and Dissipated Columns

In [0]:
import datetime as dt

In [136]:
df['Formed']

0                          4 March 2019
1                          1 March 2004
2                         27 April 2008
3                     February 23, 1988
4                     November 11, 2007
5                     February 11, 2007
6                       7 February 2016
7                       26 January 2011
8                       28 October 2015
9                         July 26, 2015
10           18 March 2006 (2006-03-18)
11                       April 24, 2006
12                        26 April 2019
13                         10 June 2019
14            May 10, 2013 (2013-05-10)
15                         June 1, 2007
16                     November 9, 2009
17    2 December 2019 (2 December 2019)
18                     November 5, 2019
19                      October 6, 2018
20                     November 5, 2015
21       November 28, 2004 (2004-11-28)
22                         May 21, 2018
23                       April 15, 2017
Name: Formed, dtype: object

In [137]:
df['Dissipated']

0                           21 March 2019
1                           18 March 2004
2                              3 May 2008
3                           March 4, 1988
4                       November 16, 2007
5                       February 23, 2007
6                            3 March 2016
7                         6 February 2011
8                         4 November 2015
9                          August 2, 2015
10             24 March 2006 (2006-03-25)
11                         April 30, 2006
12                             9 May 2019
13                           19 June 2019
14              May 17, 2013 (2013-05-17)
15                           June 8, 2007
16                      November 11, 2009
17    14 December 2019 (14 December 2019)
18                      November 12, 2019
19                       October 15, 2018
20                      November 10, 2015
21          December 3, 2004 (2004-12-04)
22                           May 27, 2018
23                         April 1

Formed

In [0]:
import dateutil.parser
import re
import unicodedata
def fix_strings(date):
  new_date = date
  new_date = re.sub(r'\([^()]*\)', '', new_date)
  regexp = re.compile(r'/s+', re.UNICODE)
  date_norm = [regexp.sub('', p) for p in new_date]
  clean_data = unicodedata.normalize("NFKD",new_date)
  #if clean_data != dt.datetime.strptime(clean_data, '%d %B, %Y'):
  return clean_data


In [0]:
df['Formed'] = df['Formed'].apply(fix_strings)
df['Formed'] = df['Formed'].str.replace(',','')

df['Dissipated'] = df['Dissipated'].apply(fix_strings)
df['Dissipated'] = df['Dissipated'].str.replace(',', '')

In [140]:
import dateutil.parser
import re

formed = df['Formed'][14]
formed = re.sub(r'\([^()]*\)', '', formed)  
regexp = re.compile(r'\s+', re.UNICODE)
prices_norm = [regexp.sub('', p) for p in formed]

#re.sub(r'\([^)]*\)', '', filename)
#date_string = df['Formed'][10]
#oi = dateutil.parser.parse(date_string)

formed

'May 10 2013 '

In [0]:
def fix_string_to_format(date):
  parts = date.split(" ")

  #print(parts) # ['12', '8', '2017']
  months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'Setepmber', 'October','November', 'December']
  result = ''
  newDate = ''
  if parts[0] in months:
      if len(parts[0]) == 1:
        parts[0] = '0' + parts[0]
      if len(parts[1]) == 1:
        parts[1] = '0' + parts[1]
      if len(parts[2]) == 2:
        parts[2] = '20' + parts[2]
      
      newDate = parts[0] + ' ' + parts[1] + ' ' + parts[2] 
  else:
    if len(parts[0]) == 1 :
        parts[0] = "0" + parts[0]
    if len(parts[1]) == 1:
        parts[1] = "0" + parts[1]
    if len(parts[2]) == 2:
        parts[2] = "20" + parts[2] 
    newDate = "/".join(parts)
    # or 
    newDate = parts[1] + " " + parts[0] + " " + parts[2]
  result = newDate
  return result


In [142]:
df['Formed'] = df['Formed'].apply(fix_string_to_format)
df['Formed']

0        March 04 2019
1        March 01 2004
2        April 27 2008
3     February 23 1988
4     November 11 2007
5     February 11 2007
6     February 07 2016
7      January 26 2011
8      October 28 2015
9         July 26 2015
10       March 18 2006
11       April 24 2006
12       April 26 2019
13        June 10 2019
14         May 10 2013
15        June 01 2007
16    November 09 2009
17    December 02 2019
18    November 05 2019
19     October 06 2018
20    November 05 2015
21    November 28 2004
22         May 21 2018
23       April 15 2017
Name: Formed, dtype: object

In [143]:
df['Dissipated'] = df['Dissipated'].apply(fix_string_to_format)
df['Dissipated']

0        March 21 2019
1        March 18 2004
2          May 03 2008
3        March 04 1988
4     November 16 2007
5     February 23 2007
6        March 03 2016
7     February 06 2011
8     November 04 2015
9       August 02 2015
10       March 24 2006
11       April 30 2006
12         May 09 2019
13        June 19 2019
14         May 17 2013
15        June 08 2007
16    November 11 2009
17    December 14 2019
18    November 12 2019
19     October 15 2018
20    November 10 2015
21    December 03 2004
22         May 27 2018
23       April 17 2017
Name: Dissipated, dtype: object

In [0]:
def parsing_dates(date):
  new_date = dt.datetime.strptime(date, "%B %d %Y")
  return new_date.date()


In [145]:
df['Formed'] = df['Formed'].apply(parsing_dates)
df['Formed']

0     2019-03-04
1     2004-03-01
2     2008-04-27
3     1988-02-23
4     2007-11-11
5     2007-02-11
6     2016-02-07
7     2011-01-26
8     2015-10-28
9     2015-07-26
10    2006-03-18
11    2006-04-24
12    2019-04-26
13    2019-06-10
14    2013-05-10
15    2007-06-01
16    2009-11-09
17    2019-12-02
18    2019-11-05
19    2018-10-06
20    2015-11-05
21    2004-11-28
22    2018-05-21
23    2017-04-15
Name: Formed, dtype: object

In [146]:
df['Dissipated'] = df['Dissipated'].apply(parsing_dates)
df['Dissipated']

0     2019-03-21
1     2004-03-18
2     2008-05-03
3     1988-03-04
4     2007-11-16
5     2007-02-23
6     2016-03-03
7     2011-02-06
8     2015-11-04
9     2015-08-02
10    2006-03-24
11    2006-04-30
12    2019-05-09
13    2019-06-19
14    2013-05-17
15    2007-06-08
16    2009-11-11
17    2019-12-14
18    2019-11-12
19    2018-10-15
20    2015-11-10
21    2004-12-03
22    2018-05-27
23    2017-04-17
Name: Dissipated, dtype: object

In [147]:
df['Duration'] = df['Dissipated'] - df['Formed']
df['Duration']

0    17 days
1    17 days
2     6 days
3    10 days
4     5 days
5    12 days
6    25 days
7    11 days
8     7 days
9     7 days
10    6 days
11    6 days
12   13 days
13    9 days
14    7 days
15    7 days
16    2 days
17   12 days
18    7 days
19    9 days
20    5 days
21    5 days
22    6 days
23    2 days
Name: Duration, dtype: timedelta64[ns]

## Highest Winds


In [0]:
df['3-minute sustained'] = pd.Series()
df['1-minute sustained'] = pd.Series()
df['10-minute sustained'] = pd.Series()
df['Gust'] = pd.Series

In [149]:
df.columns

Index(['Cyclone name', 'Formed', 'Dissipated', 'Highest winds',
       'Lowest pressure', 'Fatalities', 'Damage', 'Areas affected', 'Duration',
       '3-minute sustained', '1-minute sustained', '10-minute sustained',
       'Gust'],
      dtype='object')

In [150]:
col_name = ''
found = False
col_names = []

for col in df['Highest winds']:
  col = col.split(')')
  for value in col:
    for p in value: 
      if p != ":" and found == False:
        col_name += p
      elif p == ":":
        found = True
        col_names.append(col_name)
    col_name = ''
    found = False


print(col_names)
"""
  for val in col[i]:
    for p in val:
      if p != ':' and found == False:
        col_name += p
      elif p == ':':
        found = True 
      elif p != 'k' and found == True:
          value += p 
  i += 1          
print(col_name)
print(value[0:4])
"""

['10-minute sustained', ' 1-minute sustained', ' Gusts', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', ' Gusts', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', ' Gusts', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '3-minute sustai

"\n  for val in col[i]:\n    for p in val:\n      if p != ':' and found == False:\n        col_name += p\n      elif p == ':':\n        found = True \n      elif p != 'k' and found == True:\n          value += p \n  i += 1          \nprint(col_name)\nprint(value[0:4])\n"

In [151]:
unique_cols = set(col_names)
unique_cols

{' 1-minute sustained', ' Gusts', '10-minute sustained', '3-minute sustained'}

In [0]:
for col in unique_cols:
  df[col] = pd.Series()


In [0]:
df = df.drop('Gust', axis = 1)

In [0]:
df = df.drop('1-minute sustained', axis = 1)

In [155]:
df['Highest winds']

0     10-minute sustained: 195 km/h (120 mph) 1-minu...
1     10-minute sustained: 230 km/h (145 mph) 1-minu...
2     3-minute sustained: 165 km/h (105 mph) 1-minut...
3     10-minute sustained: 165 km/h (105 mph) 1-minu...
4     3-minute sustained: 215 km/h (130 mph) 1-minut...
5     10-minute sustained: 195 km/h (120 mph) 1-minu...
6     10-minute sustained: 280 km/h (175 mph) 1-minu...
7     10-minute sustained: 205 km/h (125 mph) 1-minu...
8     3-minute sustained: 215 km/h (130 mph) 1-minut...
9     3-minute sustained: 75 km/h (45 mph) 1-minute ...
10    10-minute sustained: 185 km/h (115 mph) 1-minu...
11    3-minute sustained: 185 km/h (115 mph) 1-minut...
12    3-minute sustained: 215 km/h (130 mph) 1-minut...
13    3-minute sustained: 150 km/h (90 mph) 1-minute...
14    3-minute sustained: 85 km/h (50 mph) 1-minute ...
15    3-minute sustained: 240 km/h (150 mph) 1-minut...
16    3-minute sustained: 85 km/h (50 mph) 1-minute ...
17    10-minute sustained: 155 km/h (100 mph) 1-

In [156]:
unique_cols
winds_dict = {}

for wind in unique_cols:
  winds_dict[wind] = 0

winds_dict
unique_cols = list(unique_cols)
unique_cols

['10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' Gusts']

In [157]:

value = ''
col_name = ''
found = False
col_names2 = []
values = []
col = df['Highest winds'][10]
print(col)
col = col.split(')')
print(col)
i = 0
one_minute = []
three_minute = []
ten_minute = []
gust = []
count_length = 0
one_bool = False
three_bool = False
gusts_bool = False
ten_bool = False
bool_list = []

for col in df['Highest winds']:
  col = col.split(')')
  for value in col:  
      if(value == '' or value == ' '):
          if(one_bool == False and three_bool == False):
            one_minute.append(0)
            three_minute.append(0)
            one_bool = True 
            three_bool = True
          elif(one_bool == False and gusts_bool == False):
            one_minute.append(0)
            gust.append(0)
            gust_bool = True
            one_bool = True 
          elif(one_bool == False and ten_bool == False):
            one_minute.append(0)
            ten.append(0)
            one_bool = True
            ten_bool = True  
          elif(three_bool == False and gusts_bool == False):
            three_minute.append(0)
            gust.append(0)
            three_bool = True
            gusts_bool = True
          elif(three_bool == False and ten_bool == False):
            three_minute.append(0)
            ten_minute.append(0)
            three_bool = True
            ten_bool = True
          elif(gusts_bool == False and ten_bool == False):
            ten_minute.append(0)
            gust.append(0)
            gusts_bool = True
            ten_bool = True
          elif(ten_bool == False and gusts_bool == False):
            ten_minute.append(0)
            gust.append(0)
            gust_bool = True
            ten_bool = True
          else:
              if(one_bool == False):
                one_minute.append(0)
              elif(three_bool == False):
                three_minute.append(0)
              elif(gusts_bool == False):
                gust.append(0)
              else:
                ten_minute.append(0)
          one_bool = False
          three_bool = False
          gusts_bool = False
          ten_bool = False
      for p in value:
          if p != ':' and found == False:
            col_name += p
            value = ''
          elif p == ":" and found == False:
            found = True
            col_names2.append(col_name)
          else:
            d = p
            if p != "k" and found == True:
              value += d
            elif p == 'k' and found == True:
              values.append(value)
              value = value.strip()
              integer_value = int(value)
              if(col_name == ' 1-minute sustained'):
                    one_minute.append(integer_value)
                    one_bool = True
                    bool_list.append(one_bool)
              elif(col_name == ' Gusts'):
                    gust.append(integer_value)
                    gusts_bool = True
              elif(col_name == '3-minute sustained'):
                    three_minute.append(integer_value)
                    three_bool = True 
              elif(col_name == '10-minute sustained'):
                    ten_minute.append(integer_value)
                    ten_bool = True
      
      col_name = ''
      found = False 
print(one_minute)
print(three_minute)
print(ten_minute)
print(gust)
print(col_names2)
print(bool_list)

10-minute sustained: 185 km/h (115 mph) 1-minute sustained: 215 km/h (130 mph) 
['10-minute sustained: 185 km/h (115 mph', ' 1-minute sustained: 215 km/h (130 mph', ' ']
[205, 260, 215, 195, 260, 220, 285, 250, 240, 85, 215, 220, 250, 185, 85, 270, 95, 185, 155, 155, 205, 120, 185, 85]
[0, 0, 165, 0, 215, 0, 0, 0, 215, 75, 0, 185, 215, 150, 85, 240, 85, 0, 140, 140, 175, 100, 175, 75]
[195, 230, 0, 165, 0, 195, 280, 205, 0, 0, 185, 0, 0, 0, 0, 0, 0, 155, 0, 0, 0, 0, 0, 0]
[280, 0, 0, 0, 0, 0, 0, 285, 0, 0, 0, 0, 305, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
['10-minute sustained', ' 1-minute sustained', ' Gusts', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '3-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', '10-minute sustained', ' 1-minute sustained', ' Gusts', '3-minute sustained', ' 1-minute sustained', '3-minute su

In [0]:
df['1-minute sustained'] = pd.Series(one_minute)

In [0]:
df['3-minute sustained'] = pd.Series(three_minute)

In [0]:
df['10-minute sustained'] = pd.Series(ten_minute)

In [0]:
df[' Gusts'] = pd.Series(gust)

In [0]:
df = df.drop(' 1-minute sustained', axis = 1)

In [163]:
df

Unnamed: 0,Cyclone name,Formed,Dissipated,Highest winds,Lowest pressure,Fatalities,Damage,Areas affected,Duration,3-minute sustained,10-minute sustained,Gusts,1-minute sustained
0,Cyclone_Idai,2019-03-04,2019-03-21,10-minute sustained: 195 km/h (120 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,"≥1,303 total[nb 1][nb 2](Deadliest tropical cy...",≥ $2.2 billion (2019 USD)(Costliest tropical c...,"northern and central Mozambique, Malawi, north...",17 days,0,195,280,205
1,Cyclone_Gafilo,2004-03-01,2004-03-18,10-minute sustained: 230 km/h (145 mph) 1-minu...,895 hPa (mbar); 26.43 inHg(Record Low in South...,"363 dead, 181 missing",$250 million (2004 USD),Madagascar,17 days,0,230,0,260
2,Cyclone_Nargis,2008-04-27,2008-05-03,3-minute sustained: 165 km/h (105 mph) 1-minut...,962 hPa (mbar); 28.41 inHg,"≥138,373 total(Sixth-deadliest tropical cyclon...",$12.9 billion (2008 USD)(Costliest cyclone rec...,"Bangladesh, Myanmar, India, Sri Lanka, Thailan...",6 days,165,0,0,215
3,Cyclone_Bola,1988-02-23,1988-03-04,10-minute sustained: 165 km/h (105 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,3 direct,$82 million (1988 USD),"Fiji, Vanuatu, New Zealand",10 days,0,165,0,195
4,Cyclone_Sidr,2007-11-11,2007-11-16,3-minute sustained: 215 km/h (130 mph) 1-minut...,944 hPa (mbar); 27.88 inHg,"3,447–15,000 total",$1.7 billion (2007 USD),"Andaman Islands, Bangladesh, West Bengal, Nort...",5 days,215,0,0,260
5,Cyclone_Favio,2007-02-11,2007-02-23,10-minute sustained: 195 km/h (120 mph) 1-minu...,925 hPa (mbar); 27.32 inHg,10,$71 million (2007 USD),"Madagascar, Mozambique, Tanzania, Zimbabwe, Ma...",12 days,0,195,0,220
6,Cyclone_Winston,2016-02-07,2016-03-03,10-minute sustained: 280 km/h (175 mph) 1-minu...,884 hPa (mbar); 26.1 inHg(Official record low ...,44 total,$1.4 billion (2016 USD)(Costliest cyclone in t...,"Vanuatu, Fiji, Tonga, Niue, Queensland",25 days,0,280,0,285
7,Cyclone_Yasi,2011-01-26,2011-02-06,10-minute sustained: 205 km/h (125 mph) 1-minu...,929 hPa (mbar); 27.43 inHg,1 indirect,$3.6 billion (2011 USD)(Costliest tropical cyc...,"Tuvalu, Fiji, Solomon Islands, Vanuatu, Papua ...",11 days,0,205,285,250
8,Cyclone_Chapala,2015-10-28,2015-11-04,3-minute sustained: 215 km/h (130 mph) 1-minut...,940 hPa (mbar); 27.76 inHg,9,$100 million (2015 USD),"Oman, Somalia, Yemen",7 days,215,0,0,240
9,Cyclone_Komen,2015-07-26,2015-08-02,3-minute sustained: 75 km/h (45 mph) 1-minute ...,986 hPa (mbar); 29.12 inHg,187–280 (including unrelated flooding),$617.1 million (2015 USD),"Myanmar, Bangladesh, India",7 days,75,0,0,85


## Lowest Preasure

In [164]:
df['Lowest pressure']

0                            940 hPa (mbar); 27.76 inHg
1     895 hPa (mbar); 26.43 inHg(Record Low in South...
2                            962 hPa (mbar); 28.41 inHg
3                            940 hPa (mbar); 27.76 inHg
4                            944 hPa (mbar); 27.88 inHg
5                            925 hPa (mbar); 27.32 inHg
6     884 hPa (mbar); 26.1 inHg(Official record low ...
7                            929 hPa (mbar); 27.43 inHg
8                            940 hPa (mbar); 27.76 inHg
9                            986 hPa (mbar); 29.12 inHg
10                           940 hPa (mbar); 27.76 inHg
11    954 hPa (mbar); 28.17 inHg(Estimated at 922 mb...
12                           932 hPa (mbar); 27.52 inHg
13                           970 hPa (mbar); 28.64 inHg
14                           990 hPa (mbar); 29.23 inHg
15                           920 hPa (mbar); 27.17 inHg
16                           988 hPa (mbar); 29.18 inHg
17                            955 hPa (mbar); 28

In [165]:
import re 

pattern = r'([^\s]+)'
value = df['Lowest pressure'][2]
result = re.search(pattern, value)
result = result.group()
result = int(result)
result



962

In [166]:
def extract_pressure(row):
  pattern = r'([^\s]+)'
  value = re.search(pattern, row)
  value = value.group()
  value = int(value)
  print(value)
  return value 

df['Pressure in Hpa'] = df['Lowest pressure'].apply(extract_pressure)

940
895
962
940
944
925
884
929
940
986
940
954
932
970
990
920
988
955
971
978
964
994
960
996


In [167]:
df['Pressure in Hpa']

0     940
1     895
2     962
3     940
4     944
5     925
6     884
7     929
8     940
9     986
10    940
11    954
12    932
13    970
14    990
15    920
16    988
17    955
18    971
19    978
20    964
21    994
22    960
23    996
Name: Pressure in Hpa, dtype: int64

## Fatalities

In [168]:
df['Fatalities']

0     ≥1,303 total[nb 1][nb 2](Deadliest tropical cy...
1                                 363 dead, 181 missing
2     ≥138,373 total(Sixth-deadliest tropical cyclon...
3                                              3 direct
4                                    3,447–15,000 total
5                                                    10
6                                              44 total
7                                            1 indirect
8                                                     9
9                187–280 (including unrelated flooding)
10                                           1 indirect
11                                             37 total
12                                             89 total
13                                              8 total
14                                 107 total, 6 missing
15                                 78 total, 37 missing
16                                         20 direct[1]
17                                              

In [169]:
s = df['Fatalities'][1]
result = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?",s)
deaths = [] 
missing = []
result

s

'363 dead, 181 missing'

In [170]:
pattern = r"[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?"
deaths = [] 
missing = []
for s in df['Fatalities']:
  result = re.findall(pattern,s)
  if ('≥' in s):
    col1 = result[0]
    value =  int(col1.replace(',', ''))
    deaths.append(value)
    missing.append(0)
  if ('–' in s ):
    col1 = result[0]
    col1 = int(col1.replace(',', ''))
    col2 = result[1]
    col2 = int(col2.replace(",", ''))
    mean = int((col2 + col1) / 2)
    deaths.append(mean)
    missing.append(0)
  elif ('[' in s and "≥" not in s):
    death = int(result[0])
    deaths.append(death)
    missing.append(0)
  elif ('missing' in s ):
    death = int(result[0])
    missing_people = int(result[1])
    deaths.append(death)
    missing.append(missing_people)
  elif ("None" in s):
    deaths.append(0)
    missing.append(0)
  elif ('≥' not in s):
    death = int(result[0])
    deaths.append(death)
    missing.append(0)

print(deaths)
print(missing)

[1303, 363, 138373, 3, 9223, 10, 44, 1, 9, 233, 1, 37, 89, 8, 107, 78, 20, 9, 41, 14, 18, 0, 31, 4]
[0, 181, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 37, 0, 0, 0, 0, 0, 0, 0, 0]


In [171]:
print(len(deaths))
print(len(missing))

24
24


In [0]:
df['Deaths'] = pd.Series(deaths)
df['Missing'] = pd.Series(missing)

In [173]:
df

Unnamed: 0,Cyclone name,Formed,Dissipated,Highest winds,Lowest pressure,Fatalities,Damage,Areas affected,Duration,3-minute sustained,10-minute sustained,Gusts,1-minute sustained,Pressure in Hpa,Deaths,Missing
0,Cyclone_Idai,2019-03-04,2019-03-21,10-minute sustained: 195 km/h (120 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,"≥1,303 total[nb 1][nb 2](Deadliest tropical cy...",≥ $2.2 billion (2019 USD)(Costliest tropical c...,"northern and central Mozambique, Malawi, north...",17 days,0,195,280,205,940,1303,0
1,Cyclone_Gafilo,2004-03-01,2004-03-18,10-minute sustained: 230 km/h (145 mph) 1-minu...,895 hPa (mbar); 26.43 inHg(Record Low in South...,"363 dead, 181 missing",$250 million (2004 USD),Madagascar,17 days,0,230,0,260,895,363,181
2,Cyclone_Nargis,2008-04-27,2008-05-03,3-minute sustained: 165 km/h (105 mph) 1-minut...,962 hPa (mbar); 28.41 inHg,"≥138,373 total(Sixth-deadliest tropical cyclon...",$12.9 billion (2008 USD)(Costliest cyclone rec...,"Bangladesh, Myanmar, India, Sri Lanka, Thailan...",6 days,165,0,0,215,962,138373,0
3,Cyclone_Bola,1988-02-23,1988-03-04,10-minute sustained: 165 km/h (105 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,3 direct,$82 million (1988 USD),"Fiji, Vanuatu, New Zealand",10 days,0,165,0,195,940,3,0
4,Cyclone_Sidr,2007-11-11,2007-11-16,3-minute sustained: 215 km/h (130 mph) 1-minut...,944 hPa (mbar); 27.88 inHg,"3,447–15,000 total",$1.7 billion (2007 USD),"Andaman Islands, Bangladesh, West Bengal, Nort...",5 days,215,0,0,260,944,9223,0
5,Cyclone_Favio,2007-02-11,2007-02-23,10-minute sustained: 195 km/h (120 mph) 1-minu...,925 hPa (mbar); 27.32 inHg,10,$71 million (2007 USD),"Madagascar, Mozambique, Tanzania, Zimbabwe, Ma...",12 days,0,195,0,220,925,10,0
6,Cyclone_Winston,2016-02-07,2016-03-03,10-minute sustained: 280 km/h (175 mph) 1-minu...,884 hPa (mbar); 26.1 inHg(Official record low ...,44 total,$1.4 billion (2016 USD)(Costliest cyclone in t...,"Vanuatu, Fiji, Tonga, Niue, Queensland",25 days,0,280,0,285,884,44,0
7,Cyclone_Yasi,2011-01-26,2011-02-06,10-minute sustained: 205 km/h (125 mph) 1-minu...,929 hPa (mbar); 27.43 inHg,1 indirect,$3.6 billion (2011 USD)(Costliest tropical cyc...,"Tuvalu, Fiji, Solomon Islands, Vanuatu, Papua ...",11 days,0,205,285,250,929,1,0
8,Cyclone_Chapala,2015-10-28,2015-11-04,3-minute sustained: 215 km/h (130 mph) 1-minut...,940 hPa (mbar); 27.76 inHg,9,$100 million (2015 USD),"Oman, Somalia, Yemen",7 days,215,0,0,240,940,9,0
9,Cyclone_Komen,2015-07-26,2015-08-02,3-minute sustained: 75 km/h (45 mph) 1-minute ...,986 hPa (mbar); 29.12 inHg,187–280 (including unrelated flooding),$617.1 million (2015 USD),"Myanmar, Bangladesh, India",7 days,75,0,0,85,986,233,0


## Damage

In [174]:
df['Damage']

0     ≥ $2.2 billion (2019 USD)(Costliest tropical c...
1                               $250 million (2004 USD)
2     $12.9 billion (2008 USD)(Costliest cyclone rec...
3                                $82 million (1988 USD)
4                               $1.7 billion (2007 USD)
5                                $71 million (2007 USD)
6     $1.4 billion (2016 USD)(Costliest cyclone in t...
7     $3.6 billion (2011 USD)(Costliest tropical cyc...
8                               $100 million (2015 USD)
9                             $617.1 million (2015 USD)
10                              $1.1 billion (2006 USD)
11                              $6.7 million (2006 USD)
12                              $8.1 billion (2019 USD)
13                                  $140,000 (2019 USD)
14                           > $35.3 million (2013 USD)
15                              $4.4 billion (2007 USD)
16                              $300 million (2009 USD)
17                             > $25 million (20

In [176]:
df['Damage'][21]

'None'

In [0]:
def current_damage(value):
  pattern = r'-?\d+\.?\d*'
  result = re.findall(pattern, value)
  inflation_rate = {
      '2020': 258.678,
      '2019': 255.6575,
      '2018':250.5,
      '2017': 245.1,
      '2016': 240,
      '2015': 237,
      '2014': 236.7,
      '1988':	118.3,
      '1989':	124,
      '1990': 130.7,	
      '1991':	136.2,	
      '1992':	140.3,	
      '1993':	144.5,	
      '1994':	148.2,	
      '1995':	152.4,
      '1996':	156.9,
      '1997':	160.5,	
      '1998':	163,
      '1999':	166.6,	
      '2000':	172.2,
      '2001':	177.1,	
      '2002':	179.9,	
      '2003':	184,	
      '2004':	188.9,	
      '2005':	195.3,	
      '2006':	201.6,	
      '2007':	207.3,	
      '2008':	215.3,	
      '2009':	214.5,	
      '2010':	218.1,	
      '2011':	224.9,	
      '2012':	229.6,	
      '2013':	233
  }
  actual_damage = []
  inflation = 0
  if ('million' in value):
    if('.' in value):
      million_value = result[0]
      million_value = million_value.replace('.','')
      million_value += '000000'
    else:
      million_value = result[0]
      million_value += '000000'
    current_value = float(million_value)
    year = result[1]
    if(year in inflation_rate):
      #print('oi')
      inflation = inflation_rate['2020'] / inflation_rate[year]
      current_value = current_value * inflation
      current_value = round(current_value, 2)
      inflation = current_value
      #actual_damage.append(current_value)
    #print(type(million_value))
  elif ('billion' in value):
    if('.' in value):
      billion_value = result[0]
      billion_value  = billion_value.replace('.','')
      billion_value  += '000000000'
      #print(billion_value)
    else:
      billion_value  = result[0]
      billion_value  += '000000000'
    current_value = float(billion_value)
    year = result[1]
    if(year in inflation_rate):
      inflation = inflation_rate['2020'] / inflation_rate[year]
      current_value = current_value * inflation
      current_value = round(current_value, 2)
      inflation = current_value
    #actual_damage.append(current_value)
  elif ('Unknown' in value or 'None' in value):
    #actual_damage.append(0)
    inflation = 0
  else:
      thousand_value = result[0] + result[1]
      year = result[2]
      thousand_value = float(thousand_value)
      if (year in inflation_rate):
        inflation = inflation_rate['2020'] / inflation_rate[year]
        current_value = thousand_value * inflation
        current_value = round(current_value, 2)
        #actual_damage.append(current_value)
        inflation = current_value
  return inflation
df['Actual Damage'] = df['Damage'].apply(current_damage)

In [178]:
df['Actual Damage']

0     2.225992e+10
1     3.423478e+08
2     1.549905e+11
3     1.793034e+08
4     2.121334e+10
5     8.859690e+07
6     1.508955e+10
7     4.140688e+10
8     1.091468e+08
9     6.735451e+09
10    1.411438e+10
11    8.596938e+07
12    8.195699e+10
13    1.416540e+05
14    3.919027e+08
15    5.490512e+10
16    3.617874e+08
17    2.529537e+07
18    3.419933e+11
19    1.032647e+09
20    0.000000e+00
21    0.000000e+00
22    1.548970e+10
23    2.469631e+04
Name: Actual Damage, dtype: float64

## Areas affected


In [179]:
df['Areas affected']

0     northern and central Mozambique, Malawi, north...
1                                            Madagascar
2     Bangladesh, Myanmar, India, Sri Lanka, Thailan...
3                            Fiji, Vanuatu, New Zealand
4     Andaman Islands, Bangladesh, West Bengal, Nort...
5     Madagascar, Mozambique, Tanzania, Zimbabwe, Ma...
6                Vanuatu, Fiji, Tonga, Niue, Queensland
7     Tuvalu, Fiji, Solomon Islands, Vanuatu, Papua ...
8                                  Oman, Somalia, Yemen
9                            Myanmar, Bangladesh, India
10                                 Far North Queensland
11          Andaman Islands, Myanmar, Northern Thailand
12    Odisha, West Bengal, Andhra Pradesh, East Indi...
13                      Maldives, India, Pakistan, Oman
14    \nIndonesia\nSri Lanka\nIndia\nThailand\nMyanm...
15           Oman, United Arab Emirates, Iran, Pakistan
16                           Sri Lanka, India, Pakistan
17             Seychelles, Mayotte, Comoros, Mad

In [180]:
a = df['Areas affected'][1].split(',')
countries = []

for s in df['Areas affected']:
  s = s.replace('\n', ',')
  country = s.split(',')

  for value in country:
    value = value.strip()
    countries.append(value)

set_countries = set(countries)
countries = list(set_countries)
countries = countries[1:]

countries

['northern Madagascar',
 'Comoros',
 'Andaman and Nicobar Islands',
 'Saudi Arabia',
 'Fiji',
 'Queensland',
 'Tanzania',
 'Solomon Islands',
 'India',
 'Tonga',
 'Bhutan',
 'Thailand',
 'northern and central Mozambique',
 'Myanmar',
 'Odisha',
 'United Arab Emirates',
 'New Zealand',
 'Tuvalu',
 'Mozambique',
 'China[citation needed]',
 'Northern Thailand',
 'Mayotte',
 'Andhra Pradesh',
 'Somalia',
 'Yunnan',
 'Iran',
 'East India',
 'Vanuatu',
 'Laos',
 'Sri Lanka',
 'Madagascar',
 'Malawi',
 'Far North Queensland',
 'Eastern India',
 'Indonesia',
 'Oman',
 'Seychelles',
 'Zimbabwe',
 'Pakistan',
 'Maldives',
 'Papua New Guinea',
 'West Bengal',
 'Australia',
 'Yemen',
 'Andaman Islands',
 'Niue',
 'Northeast India',
 'Bangladesh']

In [0]:
for country in countries:
  df[country] = pd.Series()

In [182]:
df[countries]

Unnamed: 0,northern Madagascar,Comoros,Andaman and Nicobar Islands,Saudi Arabia,Fiji,Queensland,Tanzania,Solomon Islands,India,Tonga,Bhutan,Thailand,northern and central Mozambique,Myanmar,Odisha,United Arab Emirates,New Zealand,Tuvalu,Mozambique,China[citation needed],Northern Thailand,Mayotte,Andhra Pradesh,Somalia,Yunnan,Iran,East India,Vanuatu,Laos,Sri Lanka,Madagascar,Malawi,Far North Queensland,Eastern India,Indonesia,Oman,Seychelles,Zimbabwe,Pakistan,Maldives,Papua New Guinea,West Bengal,Australia,Yemen,Andaman Islands,Niue,Northeast India,Bangladesh
0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [0]:
"""
for country in df[countries]:
  #print(country)
  for c in df['Areas affected']:
    if country in c:
      df[country] = df.iloc
    else:
      print('oi nada')
"""

for country in df[countries]:
  for i in df['Areas affected'].index:
      c= df['Areas affected'][i]
      if(country in c):
         df.at[i, country] = 1
      else:
        df.at[i, country] = 0
     

In [184]:
df

Unnamed: 0,Cyclone name,Formed,Dissipated,Highest winds,Lowest pressure,Fatalities,Damage,Areas affected,Duration,3-minute sustained,10-minute sustained,Gusts,1-minute sustained,Pressure in Hpa,Deaths,Missing,Actual Damage,northern Madagascar,Comoros,Andaman and Nicobar Islands,Saudi Arabia,Fiji,Queensland,Tanzania,Solomon Islands,India,Tonga,Bhutan,Thailand,northern and central Mozambique,Myanmar,Odisha,United Arab Emirates,New Zealand,Tuvalu,Mozambique,China[citation needed],Northern Thailand,Mayotte,Andhra Pradesh,Somalia,Yunnan,Iran,East India,Vanuatu,Laos,Sri Lanka,Madagascar,Malawi,Far North Queensland,Eastern India,Indonesia,Oman,Seychelles,Zimbabwe,Pakistan,Maldives,Papua New Guinea,West Bengal,Australia,Yemen,Andaman Islands,Niue,Northeast India,Bangladesh
0,Cyclone_Idai,2019-03-04,2019-03-21,10-minute sustained: 195 km/h (120 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,"≥1,303 total[nb 1][nb 2](Deadliest tropical cy...",≥ $2.2 billion (2019 USD)(Costliest tropical c...,"northern and central Mozambique, Malawi, north...",17 days,0,195,280,205,940,1303,0,22259920000.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Cyclone_Gafilo,2004-03-01,2004-03-18,10-minute sustained: 230 km/h (145 mph) 1-minu...,895 hPa (mbar); 26.43 inHg(Record Low in South...,"363 dead, 181 missing",$250 million (2004 USD),Madagascar,17 days,0,230,0,260,895,363,181,342347800.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Cyclone_Nargis,2008-04-27,2008-05-03,3-minute sustained: 165 km/h (105 mph) 1-minut...,962 hPa (mbar); 28.41 inHg,"≥138,373 total(Sixth-deadliest tropical cyclon...",$12.9 billion (2008 USD)(Costliest cyclone rec...,"Bangladesh, Myanmar, India, Sri Lanka, Thailan...",6 days,165,0,0,215,962,138373,0,154990500000.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
3,Cyclone_Bola,1988-02-23,1988-03-04,10-minute sustained: 165 km/h (105 mph) 1-minu...,940 hPa (mbar); 27.76 inHg,3 direct,$82 million (1988 USD),"Fiji, Vanuatu, New Zealand",10 days,0,165,0,195,940,3,0,179303400.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Cyclone_Sidr,2007-11-11,2007-11-16,3-minute sustained: 215 km/h (130 mph) 1-minut...,944 hPa (mbar); 27.88 inHg,"3,447–15,000 total",$1.7 billion (2007 USD),"Andaman Islands, Bangladesh, West Bengal, Nort...",5 days,215,0,0,260,944,9223,0,21213340000.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,1.0
5,Cyclone_Favio,2007-02-11,2007-02-23,10-minute sustained: 195 km/h (120 mph) 1-minu...,925 hPa (mbar); 27.32 inHg,10,$71 million (2007 USD),"Madagascar, Mozambique, Tanzania, Zimbabwe, Ma...",12 days,0,195,0,220,925,10,0,88596900.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Cyclone_Winston,2016-02-07,2016-03-03,10-minute sustained: 280 km/h (175 mph) 1-minu...,884 hPa (mbar); 26.1 inHg(Official record low ...,44 total,$1.4 billion (2016 USD)(Costliest cyclone in t...,"Vanuatu, Fiji, Tonga, Niue, Queensland",25 days,0,280,0,285,884,44,0,15089550000.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
7,Cyclone_Yasi,2011-01-26,2011-02-06,10-minute sustained: 205 km/h (125 mph) 1-minu...,929 hPa (mbar); 27.43 inHg,1 indirect,$3.6 billion (2011 USD)(Costliest tropical cyc...,"Tuvalu, Fiji, Solomon Islands, Vanuatu, Papua ...",11 days,0,205,285,250,929,1,0,41406880000.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
8,Cyclone_Chapala,2015-10-28,2015-11-04,3-minute sustained: 215 km/h (130 mph) 1-minut...,940 hPa (mbar); 27.76 inHg,9,$100 million (2015 USD),"Oman, Somalia, Yemen",7 days,215,0,0,240,940,9,0,109146800.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
9,Cyclone_Komen,2015-07-26,2015-08-02,3-minute sustained: 75 km/h (45 mph) 1-minute ...,986 hPa (mbar); 29.12 inHg,187–280 (including unrelated flooding),$617.1 million (2015 USD),"Myanmar, Bangladesh, India",7 days,75,0,0,85,986,233,0,6735451000.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0


In [0]:
df.to_csv('cyclones_cleaned.csv', index=False)  