## REQUIREMENTS BEFORE RUNNING THIS NOTEBOOK

### Download the data first
If you have not done so already, download the following repository: https://github.com/tonilopezrosell/understandingsuicide
It contains the data, the notebooks and the README, which describes the folder structure the code expects and the order in which the notebooks must be run.

### Libraries needed...

In [5]:
# pandas for data manipulation and analysis
import pandas as pd
# numpy for working with vectors and matrices
import numpy as np
# os for working with paths
import os

### The correct path...

Set the working directory to the folder where you saved the 'Data' folder, and assign its path to the variable 'working_directory'.

In [6]:
pwd

'C:\\Users\\Toni\\Desktop\\understandingsuicide-master\\understandingsuicide-master'

In [7]:
#In my case. I use forward slashes for compatibility with Linux and macOS
working_directory = 'C:/Users/Toni/Desktop/understandingsuicide-master/understandingsuicide-master'
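Building the paths with `pathlib` is another way to stay portable across Windows, Linux and macOS. This is only a sketch: the folder name `understandingsuicide-master` is a placeholder, and `data_file` is a hypothetical helper that is not part of the original notebook.

```python
from pathlib import Path

# Placeholder location; replace it with the folder that contains 'Data'.
working_directory = Path("understandingsuicide-master")


def data_file(*parts):
    """Hypothetical helper: build a path below the 'Data' folder."""
    return working_directory.joinpath("Data", *parts)


example = data_file("Unemployment_monthly_series", "2005-2007")
# as_posix() always renders the path with forward slashes.
print(example.as_posix())
```

`pd.read_excel` accepts `Path` objects, so a path built this way can be passed to it directly.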

### In this notebook...
We will load, clean, select and shape all the groups of files we have collected on registered unemployment in Spain from 1998 to 2017, as published by the SEPE.

# Monthly unemployment in Spain 1998 - 2001

## 1. Loading the data

In [8]:
unemployment_1998_2001 = pd.read_excel (working_directory + 
                                        '/Data/Unemployment_monthly_series/1996-2005/sispe_TOTAL_NACIONAL_1996-01.xls', 
                                        header = 6, sheet_name= 'cuadro1')
unemployment_1998_2001

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Paro registrado publicado SILE,Unnamed: 6,Variación absoluta SILE,Unnamed: 8,Variación relativa SILE %,Unnamed: 10,Paro registrado simulación SISPE,Unnamed: 12,Variación absoluta SISPE,Unnamed: 14,Variación relativa SISPE %,Unnamed: 16
0,,,,,,,,,,,,,,,,,
1,,,,,,,,,,,,,,,,,
2,,ENERO 1996,,4331374.97,,2421863.0,,,,,,2995774.10,,,,,
3,,FEBRERO 1996,,4313594.16,,2427015.0,,5152.0,,0.212729,,2994921.55,,-852.55,,-0.028458,
4,,MARZO 1996,,4281962.85,,2406095.0,,-20920.0,,-0.861964,,2969472.46,,-25449.09,,-0.849741,
5,,ABRIL 1996,,4200277.03,,2335425.0,,-70670.0,,-2.937124,,2891579.53,,-77892.93,,-2.623124,
6,,MAYO 1996,,4107148.04,,2267851.0,,-67574.0,,-2.893435,,2813180.05,,-78399.48,,-2.711303,
7,,JUNIO 1996,,4109171.88,,2234702.0,,-33149.0,,-1.461692,,2783076.24,,-30103.81,,-1.070099,
8,,JULIO 1996,,4044353.96,,2170789.0,,-63913.0,,-2.860023,,2712456.46,,-70619.78,,-2.537472,
9,,AGOSTO 1996,,3992379.04,,2143783.0,,-27006.0,,-1.244064,,2677610.39,,-34846.07,,-1.284668,


## 2. Selecting the values

We keep only the values from January 1998 onwards, which start at row 28.

In [9]:
unemployment_1998_2001 = unemployment_1998_2001['Paro registrado simulación SISPE'][28:]
unemployment_1998_2001

28    2577189.08
29    2553346.30
30    2524004.47
31    2442802.82
32    2369888.96
33    2332015.92
34    2250015.46
35    2235295.79
36    2256328.38
37    2267925.35
38    2270240.34
39    2233258.58
40           NaN
41    2251727.66
42    2233082.00
43    2202336.82
44    2144375.01
45    2076942.09
46    2045949.88
47    1980957.45
48    1980966.44
49    2002551.79
50    2019968.24
51    2055809.26
52    2027981.52
53           NaN
54    2090034.89
55    2079751.87
56    2042331.52
57    1983994.21
58    1929028.39
59    1903052.57
60    1892682.70
61    1888385.66
62    1908119.88
63    1934129.89
64    1962793.80
65    1947242.34
66           NaN
67    2017389.35
Name: Paro registrado simulación SISPE, dtype: float64

Now 'unemployment_1998_2001' holds all the data we need from this file.
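The positional slice works because January 1998 happens to sit at row 28. As a rough alternative, the dates could be read from the Spanish month labels (shown in the 'Unnamed: 1' column of the output above), so that the selection does not depend on row positions. The helper `parse_spanish_month` and the toy `sample` frame are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

SPANISH_MONTHS = {
    "ENERO": 1, "FEBRERO": 2, "MARZO": 3, "ABRIL": 4,
    "MAYO": 5, "JUNIO": 6, "JULIO": 7, "AGOSTO": 8,
    "SEPTIEMBRE": 9, "OCTUBRE": 10, "NOVIEMBRE": 11, "DICIEMBRE": 12,
}


def parse_spanish_month(label):
    """Hypothetical helper: 'ENERO 1996' -> month-end Timestamp 1996-01-31."""
    name, year = label.split()
    first_day = pd.Timestamp(year=int(year), month=SPANISH_MONTHS[name.upper()], day=1)
    return first_day + pd.offsets.MonthEnd(0)


# Toy rows imitating the sheet: blank separator rows plus labelled months.
sample = pd.DataFrame({
    "label": [None, "ENERO 1996", "FEBRERO 1996", None, "ENERO 1998"],
    "value": [None, 2995774.10, 2994921.55, None, 2577189.08],
})

rows = sample.dropna()
series = pd.Series(
    rows["value"].to_numpy(),
    index=pd.DatetimeIndex([parse_spanish_month(l) for l in rows["label"]], name="Date"),
)
# Keep everything from January 1998 onwards, whatever row it is on.
since_1998 = series[series.index >= pd.Timestamp("1998-01-01")]
print(since_1998)
```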

## 3. Shaping

We use the dropna function to drop the NaN values, which appear once per year between December and January. This is an essential step to get the data into the desired shape.

In [10]:
unemployment_1998_2001 = unemployment_1998_2001.dropna()

The result is the following data frame, with a date-type index and a properly named column.

In [11]:
#Create the dataframe
unemployment_1998_2001 = pd.DataFrame (data = unemployment_1998_2001)
#CREATE A DATE TYPE INDEX
unemployment_1998_2001 ['Date'] = pd.date_range(start = '1998-01', end = '2001-02', freq='M')
unemployment_1998_2001 = unemployment_1998_2001.set_index('Date')
#rename the column
unemployment_1998_2001 = unemployment_1998_2001.rename(columns = {'Paro registrado simulación SISPE':'N_of_unemployed'})
unemployment_1998_2001 = unemployment_1998_2001.astype(float)
unemployment_1998_2001

Date,N_of_unemployed
1998-01-31,2577189.08
1998-02-28,2553346.3
1998-03-31,2524004.47
1998-04-30,2442802.82
1998-05-31,2369888.96
1998-06-30,2332015.92
1998-07-31,2250015.46
1998-08-31,2235295.79
1998-09-30,2256328.38
1998-10-31,2267925.35
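With `start='1998-01'` and `end='2001-02'`, a month-end frequency stops at 2001-01-31, which gives 37 dates, one for each value left after `dropna`. A possible variation, sketched here on toy data, is to pass `periods=len(values)` so that the number of dates always follows the number of values. The names `values`, `dates` and `frame` are illustrative only.

```python
import pandas as pd

# Toy series standing in for the selected column (None marks a separator row).
values = pd.Series([2577189.08, 2553346.30, None, 2524004.47]).dropna()

# One month-end date per remaining value, starting in January 1998.
dates = pd.date_range(
    start="1998-01-31", periods=len(values), freq=pd.offsets.MonthEnd()
)

frame = pd.DataFrame({"N_of_unemployed": values.to_numpy()}, index=dates)
frame.index.name = "Date"
print(frame)
```

Passing the `pd.offsets.MonthEnd()` object instead of a frequency string avoids depending on which string alias the installed pandas version prefers.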


Just in case, we check whether there are any non-numeric values in our final data frame.

In [12]:
unemployment_1998_2001[~unemployment_1998_2001.applymap(np.isreal).all(1)]

Date,N_of_unemployed
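A complementary check, which would have to run before the conversion to float, is to coerce the raw values with `pd.to_numeric` and look at what could not be converted. This is a sketch on made-up values; `raw` and `bad` are illustrative names.

```python
import pandas as pd

# Made-up raw column mixing numbers, numeric text, a stray string and a gap.
raw = pd.Series(["2577189.08", "n/a", 2553346.30, None])

# Anything that is not a number becomes NaN instead of raising an error.
converted = pd.to_numeric(raw, errors="coerce")

# Entries that were present in the raw data but failed to convert.
bad = raw[converted.isna() & raw.notna()]
print(bad)
```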


Finally, we can save our data frame as a CSV file in case we need it later.

In [13]:
#unemployment_1998_2001.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_1998-2001.csv')

# Monthly unemployment in Spain 2001 - 2005

## 1. Loading the data

In [14]:
unemployment_2001_2005 = pd.read_excel (working_directory + 
                                        '/Data/Unemployment_monthly_series/1996-2005/sispe_TOTAL_NACIONAL_2001-05.xls', 
                                        header = 7, sheet_name= 'cuadro1')
unemployment_2001_2005

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,DEMANDAS SISPE,Unnamed: 4,PARO,Unnamed: 6,DEMANDANTES NO OCUPADOS,Unnamed: 8,PARADOS SP,Unnamed: 10,DENOS SISPE,Unnamed: 12
0,,,,,,,,,,,,,
1,,FEBRERO 2001,,2985358.67,,1598920.0,,2108671.0,,1993273.96,,2444303.66,
2,,MARZO 2001,,2997960.68,,1578456.0,,2097578.0,,1981006.25,,2418384.3,
3,,ABRIL 2001,,2957761.25,,1535090.0,,2056106.0,,1910452.95,,2334747.86,
4,,MAYO 2001,,2907423.02,,1478133.0,,2000006.0,,1898285.03,,2330949.51,
5,,JUNIO 2001,,2938734.97,,1460586.0,,2007788.0,,1842556.35,,2307459.37,
6,,JULIO 2001,,2954293.1,,1451469.0,,2002574.0,,1835737.61,,2304610.12,
7,,AGOSTO 2001,,2935409.94,,1459007.0,,2003141.0,,1878512.69,,2341626.66,
8,,SEPTIEMBRE 2001,,2984530.45,,1488551.0,,2027522.0,,1889184.82,,2346454.04,
9,,OCTUBRE 2001,,3003097.8,,1540003.0,,2079429.0,,1940909.14,,2395754.11,


## 2. Selecting the values

We are going to get the desired values from 2001 to 2005.

In [15]:
unemployment_2001_2005 = unemployment_2001_2005['PARADOS SP'][1:]
unemployment_2001_2005

1     1993273.96
2     1981006.25
3     1910452.95
4     1898285.03
5     1842556.35
6     1835737.61
7     1878512.69
8     1889184.82
9     1940909.14
10    1985857.40
11    1988715.32
12           NaN
13    2075022.04
14    2149907.62
15    2083103.23
16    2060070.17
17    2002923.60
18    1962962.61
19    1961851.82
20    1983981.95
21    2006785.81
22    2064512.41
23    2117143.57
24    2127018.10
25           NaN
26    2185156.24
27    2180215.72
28    2163498.36
29    2104475.07
30    2035601.11
31    2020367.11
32    1995964.19
33    2016675.30
34    2039629.98
35    2096605.89
36    2143206.29
37    2181248.13
38           NaN
39    2232560.03
40    2219299.50
41    2181546.15
42    2162405.43
43    2090701.59
44    2054112.54
45    2014217.75
46    2049639.24
47    2050513.65
48    2075810.91
49    2121089.25
50    2112715.05
51           NaN
52    2176599.00
53    2165420.00
54    2144835.00
55    2095945.00
Name: PARADOS SP, dtype: float64

Now 'unemployment_2001_2005' holds all the data we need from this file.

## 3. Shaping

In [16]:
unemployment_2001_2005 = unemployment_2001_2005.dropna()

*Final data frame*

In [17]:
#Create the dataframe
unemployment_2001_2005 = pd.DataFrame (data = unemployment_2001_2005)
#CREATE A DATE TYPE INDEX
unemployment_2001_2005 ['Date'] = pd.date_range(start = '2001-02', end = '2005-05', freq='M')
unemployment_2001_2005 = unemployment_2001_2005.set_index('Date')
#rename the column
unemployment_2001_2005 = unemployment_2001_2005.rename(columns = {'PARADOS SP':'N_of_unemployed'})
unemployment_2001_2005 = unemployment_2001_2005.astype(float)
unemployment_2001_2005

Date,N_of_unemployed
2001-02-28,1993273.96
2001-03-31,1981006.25
2001-04-30,1910452.95
2001-05-31,1898285.03
2001-06-30,1842556.35
2001-07-31,1835737.61
2001-08-31,1878512.69
2001-09-30,1889184.82
2001-10-31,1940909.14
2001-11-30,1985857.4


In [18]:
unemployment_2001_2005[~unemployment_2001_2005.applymap(np.isreal).all(1)]

Date,N_of_unemployed


*Save as csv*

In [19]:
#unemployment_2001_2005.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_2001_2005.csv')

# Monthly unemployment in Spain 2005 - 2007


## 1. Loading the data

For these years there is one file per month, so we have to proceed differently.
I'm going to build a list of all the files, iterate over it and extract the value we are looking for from each one.

In [20]:
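def list_of_xls_files (desired_files_directory):
    # os.chdir returns None, so change directory first and then list the current one.
    # The loops below open the files by bare name, so they rely on this change of directory.
    os.chdir(desired_files_directory)
    # Lower-case the names before sorting so that case does not affect the order
    # (opening lower-cased names later assumes a case-insensitive file system, e.g. Windows)
    files = sorted(x.lower() for x in os.listdir())
    # Keep only the 'xls' files
    files_xls = [f for f in files if f.endswith('.xls')]
    return files_xls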
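A variation that neither changes the working directory nor alters the file names could use `pathlib`. It returns full paths, sorted case-insensitively, which also works on case-sensitive file systems. This is a sketch only: `list_xls_paths` is a hypothetical name, and the temporary folder with empty files exists just to make the example runnable.

```python
import tempfile
from pathlib import Path


def list_xls_paths(directory):
    """Hypothetical helper: full paths of the .xls files, sorted ignoring case."""
    candidates = (p for p in Path(directory).iterdir() if p.suffix.lower() == ".xls")
    return sorted(candidates, key=lambda p: p.name.lower())


# Demonstration on a throwaway folder containing empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["Av_sispe_0506.xls", "av_sispe_0505.xls", "notes.txt"]:
        (Path(tmp) / name).touch()
    found = [p.name for p in list_xls_paths(tmp)]

print(found)
```

Because the function returns full paths, each one can be passed to `pd.read_excel` regardless of the current working directory.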

In [21]:
unemployment_2005_2007_files = list_of_xls_files (working_directory + '/Data/Unemployment_monthly_series/2005-2007')

In [22]:
unemployment_2005_2007_files

['av_sispe_0505.xls',
 'av_sispe_0506.xls',
 'av_sispe_0507.xls',
 'av_sispe_0508.xls',
 'av_sispe_0509.xls',
 'av_sispe_0510.xls',
 'av_sispe_0511.xls',
 'av_sispe_0512.xls',
 'av_sispe_0601.xls',
 'av_sispe_0602.xls',
 'av_sispe_0603.xls',
 'av_sispe_0604.xls',
 'av_sispe_0605.xls',
 'av_sispe_0606.xls',
 'av_sispe_0607.xls',
 'av_sispe_0608.xls',
 'av_sispe_0609.xls',
 'av_sispe_0610.xls',
 'av_sispe_0611.xls',
 'av_sispe_0612.xls',
 'av_sispe_0701.xls',
 'av_sispe_0702.xls',
 'av_sispe_0703.xls',
 'av_sispe_0704.xls',
 'av_sispe_0705.xls',
 'av_sispe_0706.xls',
 'av_sispe_0707.xls',
 'av_sispe_0708.xls',
 'av_sispe_0709.xls',
 'av_sispe_0710.xls',
 'av_sispe_0711.xls',
 'av_sispe_0712.xls']

In [23]:
#Initialize empty list:
unemployed_per_month = []

Loop over the list of files, read the required value from each file and append it to the empty list we have created.

In [24]:
for filepath in unemployment_2005_2007_files:
    #read the file
    unemployment_2005_2007 = pd.read_excel (filepath, header = 6, sheet_name= 'Pag. 13')
    #get the needed value
    unemployment_2005_2007 = unemployment_2005_2007['PARADOS REGISTRADOS'][1]
    #fill the list
    unemployed_per_month += [unemployment_2005_2007]

unemployed_per_month

[2007393,
 1974860,
 1989417,
 2019110,
 2013286,
 2052861,
 2095580,
 2102937,
 2171503,
 2169277,
 2148530,
 2075676,
 2004528,
 1959754,
 1954984,
 1983677,
 1966166,
 1992836,
 2023164,
 2022873,
 2082508,
 2075275,
 2059451,
 2023124,
 1973231,
 1965869,
 1970338,
 2028296,
 2017363,
 2048577,
 2094473,
 2129547]

## 2. Shaping

*Final data frame*

In [25]:
#Create the dataframe
unemployment_2005_2007 = pd.DataFrame (data = unemployed_per_month)
#Date type index
unemployment_2005_2007 ['Date'] = pd.date_range(start = '2005-05', end = '2008-01', freq='M')
unemployment_2005_2007 = unemployment_2005_2007.set_index('Date')
#Rename the column and convert to float
unemployment_2005_2007 = unemployment_2005_2007.rename(columns = {0:'N_of_unemployed'})
unemployment_2005_2007 = unemployment_2005_2007.astype(float)
unemployment_2005_2007

Date,N_of_unemployed
2005-05-31,2007393.0
2005-06-30,1974860.0
2005-07-31,1989417.0
2005-08-31,2019110.0
2005-09-30,2013286.0
2005-10-31,2052861.0
2005-11-30,2095580.0
2005-12-31,2102937.0
2006-01-31,2171503.0
2006-02-28,2169277.0
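The index above is built independently of the file list, so it is only right if no month is missing from the folder. One way to tie each value to its month is to derive the date from the file name. This sketch assumes that the four digits before `.xls` encode year and month as YYMM, which matches the listing above (`av_sispe_0505.xls` first, data starting in May 2005); `month_end_from_filename` is a hypothetical helper.

```python
import re

import pandas as pd


def month_end_from_filename(filename):
    """Hypothetical helper: 'av_sispe_0505.xls' -> Timestamp('2005-05-31')."""
    match = re.search(r"(\d{2})(\d{2})\.xls$", filename.lower())
    if match is None:
        raise ValueError(f"no YYMM suffix in {filename!r}")
    # Assumption: two-digit years belong to the 2000s.
    year, month = 2000 + int(match.group(1)), int(match.group(2))
    return pd.Timestamp(year=year, month=month, day=1) + pd.offsets.MonthEnd(0)


files = ["av_sispe_0505.xls", "av_sispe_0506.xls", "av_sispe_0712.xls"]
index = pd.DatetimeIndex([month_end_from_filename(f) for f in files], name="Date")
print(index)
```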


In [26]:
#Drop Nan values
unemployment_2005_2007 = unemployment_2005_2007.dropna()

In [27]:
#Check if there is any non numeric value
unemployment_2005_2007[~unemployment_2005_2007.applymap(np.isreal).all(1)]


Unnamed: 0_level_0,N_of_unemployed
Date,Unnamed: 1_level_1


In [28]:
#Save as csv
#unemployment_2005_2007.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_2005-2007.csv')

# Monthly unemployment in Spain 2008 - 2012

## 1. Loading the data

In [29]:
unemployment_2008_2012 = pd.read_excel (working_directory + 
                                        '/Data/Unemployment_monthly_series/2008-2012/Av_sispe_0801.xls', 
                                        header = 6, sheet_name= 'Pag. 19')
unemployment_2008_2012

Unnamed: 0.1,Unnamed: 0,TOTAL,Unnamed: 2,OCUPADOS,Unnamed: 4,LIMITADA (*),Unnamed: 6,Unnamed: 7,TOTAL.1,Unnamed: 9,OTROS NO OCUPADOS,Unnamed: 11,PARADOS REGISTRADOS,Unnamed: 13
0,,,,,,,,,,,,,,
1,TOTAL,3180199,,590174,,114047,,,2475978,,214053,,2261925,
2,,,,,,,,,,,,,,
3,SEXO:,,,,,,,,,,,,,
4,HOMBRES,1312175,,250982,,66168,,,995025,,59428,,935597,
5,MUJERES,1868024,,339192,,47879,,,1480953,,154625,,1326328,
6,,,,,,,,,,,,,,
7,MENORES DE 25 AÑOS:,,,,,,,,,,,,,
8,HOMBRES,179002,,25094,,5605,,,148303,,14241,,134062,
9,MUJERES,184650,,28925,,4384,,,151341,,22896,,128445,


As with the previous group of files, there is one file per month, so we use the same procedure.

In [30]:
unemployment_2008_2012_files = list_of_xls_files (working_directory + '/Data/Unemployment_monthly_series/2008-2012')

In [31]:
#Initialize empty list:
unemployed_per_month = []

for filepath in unemployment_2008_2012_files:
    #read the file
    unemployment_2008_2012 = pd.read_excel (filepath, header = 6, sheet_name= 'Pag. 19')
    #get the needed value
    unemployment_2008_2012 = unemployment_2008_2012['PARADOS REGISTRADOS'][1]
    #fill the list
    unemployed_per_month += [unemployment_2008_2012]

unemployed_per_month

[2261925,
 2315331,
 2300975,
 2338517,
 2353575,
 2390424,
 2426916,
 2530001,
 2625368,
 2818026,
 2989269,
 3128963,
 3327801,
 3481859,
 3605402,
 3644880,
 3620139,
 3564889,
 3544095,
 3629080,
 3709447,
 3808353,
 3868946,
 3923603,
 4048493,
 4130625,
 4166613,
 4142425,
 4066202,
 3982368,
 3908578,
 3969661,
 4017763,
 4085976,
 4110294,
 4100073,
 4231003,
 4299263,
 4333669,
 4269360,
 4189659,
 4121801,
 4079742,
 4130927,
 4226744,
 4360926,
 4420462,
 4422359,
 4599829,
 4712098,
 4750867,
 4744235,
 4714122,
 4615269,
 4587455,
 4625634,
 4705279,
 4833521,
 4907817,
 4848723]
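The loop picks the value by position (`[1]`), which works as long as the 'TOTAL' row stays on the second line of every sheet. A more defensive variant, sketched here on a toy frame that imitates the first rows of the sheet shown earlier, looks the row up by its label. `total_unemployed` is a hypothetical helper, and the label column name is a parameter because it differs between groups of files.

```python
import pandas as pd

# Toy frame imitating the first rows of the 'Pag. 19' sheet.
sheet = pd.DataFrame({
    "Unnamed: 0": [None, "TOTAL", None, "SEXO:", "HOMBRES", "MUJERES"],
    "PARADOS REGISTRADOS": [None, 2261925, None, None, 935597, 1326328],
})


def total_unemployed(frame, label_column="Unnamed: 0"):
    """Hypothetical helper: 'PARADOS REGISTRADOS' in the first row labelled 'TOTAL'."""
    rows = frame.loc[frame[label_column] == "TOTAL", "PARADOS REGISTRADOS"]
    if rows.empty:
        raise ValueError("no 'TOTAL' row found")
    return rows.iloc[0]


print(total_unemployed(sheet))
```

Raising an error when the label is missing makes a change in the sheet layout visible, instead of silently reading the wrong cell.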

## 2. Shaping

*Final data frame*

In [32]:
#Create the dataframe
unemployment_2008_2012 = pd.DataFrame (data = unemployed_per_month)
#Date type index
unemployment_2008_2012 ['Date'] = pd.date_range(start = '2008-01', end = '2013-01', freq='M')
unemployment_2008_2012 = unemployment_2008_2012.set_index('Date')
#Rename the column
unemployment_2008_2012 = unemployment_2008_2012.rename(columns = {0:'N_of_unemployed'})
unemployment_2008_2012 = unemployment_2008_2012.astype(float)
unemployment_2008_2012

Date,N_of_unemployed
2008-01-31,2261925.0
2008-02-29,2315331.0
2008-03-31,2300975.0
2008-04-30,2338517.0
2008-05-31,2353575.0
2008-06-30,2390424.0
2008-07-31,2426916.0
2008-08-31,2530001.0
2008-09-30,2625368.0
2008-10-31,2818026.0


In [33]:
#Drop Nan values
unemployment_2008_2012 = unemployment_2008_2012.dropna()

In [34]:
#Check if there is any non numeric value
unemployment_2008_2012[~unemployment_2008_2012.applymap(np.isreal).all(1)]

Date,N_of_unemployed


In [35]:
#Save as csv
#unemployment_2008_2012.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_2008-2012.csv')

# Monthly unemployment in Spain 2013 - 2017

## 1. Loading the data

In [36]:
unemployment_2013_2017 = pd.read_excel (working_directory + 
                                        '/Data/Unemployment_monthly_series/2013-2017/Av_sispe_1301.xls', 
                                        header = 5, sheet_name= 'PAG 19')
unemployment_2013_2017

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,DDAN EMPL. ESPEC.*,Unnamed: 10,TOTAL,Unnamed: 12,OTROS NO OCUPADOS /TEASS,Unnamed: 14,PARADOS REGISTRADOS
0,,,,,,,,,,,,,,,,
1,,TOTAL,,,,6471021.0,,931582.0,,253574.0,,5285865.0,,305087.0,,4980778.0
2,,SEXO:,,,,,,,,,,,,,,
3,,HOMBRES,,,,3187317.0,,437779.0,,146992.0,,2602546.0,,129716.0,,2472830.0
4,,MUJERES,,,,3283704.0,,493803.0,,106582.0,,2683319.0,,175371.0,,2507948.0
5,,,,,,,,,,,,,,,,
6,,MENORES DE 25 AÑOS:,,,,,,,,,,,,,,
7,,< 25 años,,HOMBRES,,313852.0,,26140.0,,10854.0,,276858.0,,31008.0,,245850.0
8,,,,MUJERES,,282686.0,,21410.0,,9349.0,,251927.0,,33937.0,,217990.0
9,,TOTAL,,TOTAL,,596538.0,,47550.0,,20203.0,,528785.0,,64945.0,,463840.0


In [37]:
unemployment_2013_2017_files = list_of_xls_files (working_directory + '/Data/Unemployment_monthly_series/2013-2017')

In [38]:
#Initialize empty list:
unemployed_per_month = []

for filepath in unemployment_2013_2017_files:
    #read the file
    unemployment_2013_2017 = pd.read_excel (filepath, header = 5, sheet_name= 'PAG 19')
    #get the needed value
    unemployment_2013_2017 = unemployment_2013_2017['PARADOS REGISTRADOS'][1]
    #fill the list
    unemployed_per_month += [unemployment_2013_2017]

unemployed_per_month

[4980778.0,
 5040222.0,
 5035243.0,
 4989193.0,
 4890928.0,
 4763680.0,
 4698814.0,
 4698783.0,
 4724355.0,
 4811383.0,
 4808908.0,
 4701338.0,
 4814435.0,
 4812486.0,
 4795866.0,
 4684301.0,
 4572385.0,
 4449701.0,
 4419860.0,
 4427930.0,
 4447650.0,
 4526804.0,
 4512116.0,
 4447711.0,
 4525691.0,
 4512153.0,
 4451939.0,
 4333016.0,
 4215031.0,
 4120304.0,
 4046276.0,
 4067955.0,
 4094042.0,
 4176369.0,
 4149298.0,
 4093508.0,
 4150755.0,
 4152986.0,
 4094770.0,
 4011171.0,
 3891403.0,
 3767054.0,
 3683061.0,
 3697496.0,
 3720297.0,
 3764982.0,
 3789823.0,
 3702974.0,
 3760231.0,
 3750876.0,
 3702317.0,
 3573036.0,
 3461128.0,
 3362811.0,
 3335924.0,
 3382324.0,
 3410182.0,
 3467026.0,
 3474281.0,
 3412781.0]

## 2. Shaping

*Final data frame*

In [39]:
#Create the dataframe
unemployment_2013_2017 = pd.DataFrame (data = unemployed_per_month)
#Date type index
unemployment_2013_2017 ['Date'] = pd.date_range(start = '2013-01', end = '2018-01', freq='M')
unemployment_2013_2017 = unemployment_2013_2017.set_index('Date')
#Rename the column
unemployment_2013_2017 = unemployment_2013_2017.rename(columns = {0:'N_of_unemployed'})
unemployment_2013_2017 = unemployment_2013_2017.astype(float)
unemployment_2013_2017

Date,N_of_unemployed
2013-01-31,4980778.0
2013-02-28,5040222.0
2013-03-31,5035243.0
2013-04-30,4989193.0
2013-05-31,4890928.0
2013-06-30,4763680.0
2013-07-31,4698814.0
2013-08-31,4698783.0
2013-09-30,4724355.0
2013-10-31,4811383.0


In [40]:
#Drop Nan values
unemployment_2013_2017 = unemployment_2013_2017.dropna()

In [41]:
#Check if there is any non numeric value
unemployment_2013_2017[~unemployment_2013_2017.applymap(np.isreal).all(1)]

Date,N_of_unemployed


In [42]:
#Save as csv
#unemployment_2013_2017.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_2013-2017.csv')

# Putting it all together

In [43]:
unemployment_1998_2017 = pd.concat ([unemployment_1998_2001, unemployment_2001_2005, unemployment_2005_2007, unemployment_2008_2012, unemployment_2013_2017])

#Final csv
unemployment_1998_2017.to_csv(working_directory + '/Data/Monthly_unemployment_Spain_1998-2017.csv')


In [44]:
unemployment_1998_2017.index

DatetimeIndex(['1998-01-31', '1998-02-28', '1998-03-31', '1998-04-30',
               '1998-05-31', '1998-06-30', '1998-07-31', '1998-08-31',
               '1998-09-30', '1998-10-31',
               ...
               '2017-03-31', '2017-04-30', '2017-05-31', '2017-06-30',
               '2017-07-31', '2017-08-31', '2017-09-30', '2017-10-31',
               '2017-11-30', '2017-12-31'],
              dtype='datetime64[ns]', name='Date', length=240, freq=None)
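The length of 240 matches 20 years of 12 months. Since the five pieces were built separately, it may also be worth confirming that their boundaries neither overlap nor leave a gap. The following sketch does this on two toy frames; `check_monthly_index` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd


def check_monthly_index(index):
    """Hypothetical helper: list duplicated and missing month-end dates."""
    expected = pd.date_range(
        start=index.min(), end=index.max(), freq=pd.offsets.MonthEnd()
    )
    return {
        "duplicated": index[index.duplicated()].tolist(),
        "missing": expected.difference(index).tolist(),
    }


# Two toy pieces that meet at the 2007/2008 boundary.
first = pd.DataFrame(
    {"N_of_unemployed": [1.0, 2.0]},
    index=pd.DatetimeIndex(["2007-11-30", "2007-12-31"], name="Date"),
)
second = pd.DataFrame(
    {"N_of_unemployed": [3.0, 4.0]},
    index=pd.DatetimeIndex(["2008-01-31", "2008-02-29"], name="Date"),
)
combined = pd.concat([first, second])
report = check_monthly_index(combined.index)
print(report)
```

Two empty lists in the report mean that every month between the first and the last date appears exactly once.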

*All the desired values are now stored in the appropriate format in 'Monthly_unemployment_Spain_1998-2017.csv'.*
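When the file is read back in a later notebook, the 'Date' column has to be parsed again to recover the date-type index. The following is a small round-trip sketch that uses an in-memory buffer in place of the real file; `frame`, `buffer` and `restored` are illustrative names.

```python
import io

import pandas as pd

# Toy frame with the same layout as the final data frame.
frame = pd.DataFrame(
    {"N_of_unemployed": [2577189.08, 2553346.30]},
    index=pd.DatetimeIndex(["1998-01-31", "1998-02-28"], name="Date"),
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
frame.to_csv(buffer)
buffer.seek(0)

# index_col and parse_dates restore the date-type index.
restored = pd.read_csv(buffer, index_col="Date", parse_dates=True)
print(restored.index)
```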