<a href="https://colab.research.google.com/github/fjme95/python-para-la-ciencia-de-datos/blob/main/semana%202/Procesamiento_de_Lenguaje_Natural.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Natural Language Processing

This notebook is an introduction to text processing with Python 3. The steps are:

1. Convert the text to lowercase
2. Remove punctuation
3. Remove stopwords
4. Lemmatization or stemming
5. Vectorization

Steps 1 through 4, the text preprocessing, are optional. However, they usually lead to a better vectorization.

We vectorize the text so that it can be used as input to machine learning models.

The method we will use to turn text into input for mathematical models (regressions, PCA, SVM, etc.) is the [bag-of-words](https://en.wikipedia.org/wiki/Bag-of-words_model) (BoW) model, without considering [n-grams](https://en.wikipedia.org/wiki/N-gram) or [skip-grams](https://en.wikipedia.org/wiki/N-gram#Skip-gram). In this model, each entry of the final vector represents a single word.
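
To make the idea concrete, here is a minimal sketch of a bag of words built with the standard library only. The two short documents are made up for illustration and are not part of the dataset used later.

```python
from collections import Counter

# Two made-up documents (assumption: whitespace is a good enough tokenizer here).
docs = ["the plane flies", "the plane lands and the pilot rests"]
tokenized = [doc.split() for doc in docs]

# The vocabulary is the sorted set of all words seen in any document.
vocabulary = sorted({word for doc in tokenized for word in doc})

# Each document becomes a vector with one entry (a count) per vocabulary word.
vectors = [[Counter(doc)[word] for word in vocabulary] for doc in tokenized]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Word order is lost: each vector only records how many times each vocabulary word occurs in the document.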

## Notebook dependencies

```import``` is used to load the libraries we will work with. The functions, classes, and variables of a library are accessed by writing the library's name followed by the name of the resource. For example,

```python
import numpy 
numpy.array([12, 3, 4])
``` 

You can also give a library an alias by writing ```as``` after the ```import``` (e.g. ```import numpy as np```), which shortens calls to the library's resources:

```python
import numpy as np
np.array([12, 3, 4])
```

If you only need a few functions from the library, you can import them with ```from ... import ...```.

```python
from numpy import array, arange
array([1, 2, 3])
arange(10)
```

We import the dependencies we will use.

In [None]:
import pandas as pd

import re

import nltk 
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer, PorterStemmer
from nltk.tokenize import word_tokenize

from sklearn.feature_extraction.text import TfidfVectorizer

We download two ```nltk``` resources: ```stopwords```, to remove stopwords from our corpus, and ```wordnet```, to lemmatize.

In [None]:
nltk.download('stopwords')
nltk.download('wordnet')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


True

In [None]:
lemmatizer = WordNetLemmatizer()
stemmer = PorterStemmer()

## Reading the data

We will use the [Ohsumed](https://www.mat.unical.it/OlexSuite/Datasets/SampleDataSets-about.htm) collection to show the complete processing workflow. It is a subset of the MEDLINE database, which contains medical literature and is maintained by the National Library of Medicine.

In [None]:
data = pd.read_csv('https://raw.githubusercontent.com/danieljfeller/medline-multilabel/master/data/processed/ohsumed_abstracts.csv', usecols=['doc', 'label'])
print(data.shape)
data

(13924, 2)


Unnamed: 0,label,doc
0,16,Improved outcome at 28 days of age for very l...
1,5,Chylothorax after posterior spinal instrument...
2,16,Childhood pulmonary function following hyalin...
3,8,Treatment of atelectasis of upper lung lobes....
4,21,"Decision analysis, public health policy, and ..."
...,...,...
13919,11,Results of blepharoptosis surgery with early ...
13920,11,"A century of cerebral achromatopsia, This rev..."
13921,11,Intraocular lens implantation after penetrati...
13922,11,Reproducibility of topographic measurements o...


We will use the first document as the example for each processing step.


In [None]:
ejemplo = data['doc'][0]
ejemplo

' Improved outcome at 28 days of age for very low birth weight infants treated with a single dose of a synthetic surfactant, Two identical double-blind. controlled. randomized trials were initiated to determine whether the administration of a single 5 ml/kg dose of a synthetic surfactant (Exosurf Neonatal). soon after the delivery of infants with birth weights 700 to 1350 gm. would improve rates of survival without bronchopulmonary dysplasia. Both trials were terminated before enrolling their planned sample sizes because of the availability of Exosurf under the provisions of a Treatment Investigational New Drug program. We report the combined results of these trials. Study infants were stratified according to birth weight and gender before random assignment to a treatment regimen. One hundred ninety-two infants received Exosurf and 193 received an air placebo. The study groups were similar when a variety of demographic features describing the mothers. their pregnancies. the circumstanc

## Preprocessing

The goal of text preprocessing is to standardize the text and reduce it as much as possible, which makes it easier to analyze and avoids training our models on "irrelevant text".

### Lowercasing the text

The first step is to convert everything to lowercase. We do this so that the same word does not get more than one entry in the final vector (e.g. ```Plane``` and ```plane``` are the same word, but the vectorizer would treat them as different).

To convert the text, we use the string method ```lower```.
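
As a small, self-contained illustration of why case matters, the sketch below counts the tokens of a made-up string before and after lowercasing. The string is an assumption chosen for the example.

```python
from collections import Counter

text = "Plane plane PLANE"  # made-up example string

# Without lowercasing, each spelling becomes its own vocabulary entry.
raw_counts = Counter(text.split())

# After lowercasing, all three collapse into a single entry.
lower_counts = Counter(text.lower().split())

print(len(raw_counts), len(lower_counts))
```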

In [None]:
ejemplo = ejemplo.lower()
print(len(ejemplo))
ejemplo

1977


' improved outcome at 28 days of age for very low birth weight infants treated with a single dose of a synthetic surfactant, two identical double-blind. controlled. randomized trials were initiated to determine whether the administration of a single 5 ml/kg dose of a synthetic surfactant (exosurf neonatal). soon after the delivery of infants with birth weights 700 to 1350 gm. would improve rates of survival without bronchopulmonary dysplasia. both trials were terminated before enrolling their planned sample sizes because of the availability of exosurf under the provisions of a treatment investigational new drug program. we report the combined results of these trials. study infants were stratified according to birth weight and gender before random assignment to a treatment regimen. one hundred ninety-two infants received exosurf and 193 received an air placebo. the study groups were similar when a variety of demographic features describing the mothers. their pregnancies. the circumstanc

### Removing punctuation, numbers, and stopwords

We usually remove these elements to clean the text because they tend to carry little information about the document. For example, consider the two texts below.

> The plane, which is large, flies in the skies.

> plane large flies skies

Both strings convey the same information, and the second is shorter. Besides, the vectorization method we will use (BoW) ignores word order, so in the end only the words themselves matter.
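
A minimal sketch of this cleaning on a sentence like the one above. The stopword set here is hand-picked for this one sentence (an assumption); it is not the NLTK list used in the rest of the notebook.

```python
import re

# Hypothetical, hand-picked stopwords for this sentence only (not NLTK's list).
STOP_WORDS = {"the", "which", "is", "in"}

sentence = "The plane, which is large, flies in the skies."

# Lowercase, then drop every character that is not a letter or whitespace.
no_punct = re.sub(r"[^a-z\s]+", "", sentence.lower())

# Keep only the tokens that are not stopwords.
tokens = [word for word in no_punct.split() if word not in STOP_WORDS]
cleaned = " ".join(tokens)

print(cleaned)
```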


In [None]:
# remove punctuation and numbers (the raw string avoids an invalid-escape warning for \s)
ejemplo = re.sub(r'[^a-z\s]+', '', ejemplo)
print(len(ejemplo))
ejemplo

1889


' improved outcome at  days of age for very low birth weight infants treated with a single dose of a synthetic surfactant two identical doubleblind controlled randomized trials were initiated to determine whether the administration of a single  mlkg dose of a synthetic surfactant exosurf neonatal soon after the delivery of infants with birth weights  to  gm would improve rates of survival without bronchopulmonary dysplasia both trials were terminated before enrolling their planned sample sizes because of the availability of exosurf under the provisions of a treatment investigational new drug program we report the combined results of these trials study infants were stratified according to birth weight and gender before random assignment to a treatment regimen one hundred ninetytwo infants received exosurf and  received an air placebo the study groups were similar when a variety of demographic features describing the mothers their pregnancies the circumstances of the births and the infan

The ```stopwords``` we will remove can be listed by running the following cell.

In [None]:
stopwords.words('english')

To remove the stopwords, we use a [```list comprehension```](https://www.w3schools.com/python/python_lists_comprehension.asp).

In [None]:
# remove the stopwords; building the set once makes each membership test fast
stop_words = set(stopwords.words('english'))
ejemplo_list = [x for x in ejemplo.split(' ') if x not in stop_words]
ejemplo = ' '.join(ejemplo_list)
print(len(ejemplo))
ejemplo

1483


' improved outcome  days age low birth weight infants treated single dose synthetic surfactant two identical doubleblind controlled randomized trials initiated determine whether administration single  mlkg dose synthetic surfactant exosurf neonatal soon delivery infants birth weights   gm would improve rates survival without bronchopulmonary dysplasia trials terminated enrolling planned sample sizes availability exosurf provisions treatment investigational new drug program report combined results trials study infants stratified according birth weight gender random assignment treatment regimen one hundred ninetytwo infants received exosurf  received air placebo study groups similar variety demographic features describing mothers pregnancies circumstances births infants compared exosurftreated infants required significantly less oxygen respiratory support first  days life comparison airtreated infants fewer infants exosurf group pulmonary interstitial emphysema  vs  p   exosurf group sig

### Lemmatization and stemming

The goal of this step is to shrink the BoW even further by transforming words that refer to the same thing but are conjugated or pluralized. For example, consider the Spanish words ```correr```, ```corres```, and ```corriendo``` ("to run", "you run", "running"). For a model, each of these forms usually adds no extra information; they only create more dimensions in the vectorization. We would like the three words to mean the same thing to the model by reducing them to their stem, ```corr```, or to their lemma, ```correr```.

Lemmatization and stemming are good tools for this.

```Lemmatization``` obtains the lemma of each word taking its part of speech into account, whereas ```stemming``` follows rules that strip the endings of words and turn them into "stems".
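
The difference can be sketched with two toy functions. This is only an illustration under strong assumptions: the suffix list and the lookup table below are hypothetical and far smaller than what the Porter stemmer or WordNet actually use.

```python
# Hypothetical, deliberately tiny rule set; real stemmers apply many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup table standing in for a dictionary-based lemmatizer.
LEMMAS = {"studies": "study", "studied": "study", "studying": "study", "better": "good"}

def toy_lemmatize(word):
    """Return the dictionary form if known, otherwise the word unchanged."""
    return LEMMAS.get(word, word)

for word in ["studies", "studied", "studying", "better"]:
    print(word, toy_stem(word), toy_lemmatize(word))
```

The rule-based stems need not be real words (```studi```), and rules cannot handle irregular forms such as ```better```, which a dictionary lookup can.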

In [None]:
ejemplo_lemma =  ' '.join([lemmatizer.lemmatize(x) for x in ejemplo_list])
print(len(ejemplo_lemma))
ejemplo_lemma

1447


' improved outcome  day age low birth weight infant treated single dose synthetic surfactant two identical doubleblind controlled randomized trial initiated determine whether administration single  mlkg dose synthetic surfactant exosurf neonatal soon delivery infant birth weight   gm would improve rate survival without bronchopulmonary dysplasia trial terminated enrolling planned sample size availability exosurf provision treatment investigational new drug program report combined result trial study infant stratified according birth weight gender random assignment treatment regimen one hundred ninetytwo infant received exosurf  received air placebo study group similar variety demographic feature describing mother pregnancy circumstance birth infant compared exosurftreated infant required significantly le oxygen respiratory support first  day life comparison airtreated infant fewer infant exosurf group pulmonary interstitial emphysema  v  p   exosurf group significant reduction combined 

In [None]:
ejemplo_stem =  ' '.join([stemmer.stem(x) for x in ejemplo_list])
print(len(ejemplo_stem))
ejemplo_stem

1272


' improv outcom  day age low birth weight infant treat singl dose synthet surfact two ident doubleblind control random trial initi determin whether administr singl  mlkg dose synthet surfact exosurf neonat soon deliveri infant birth weight   gm would improv rate surviv without bronchopulmonari dysplasia trial termin enrol plan sampl size avail exosurf provis treatment investig new drug program report combin result trial studi infant stratifi accord birth weight gender random assign treatment regimen one hundr ninetytwo infant receiv exosurf  receiv air placebo studi group similar varieti demograph featur describ mother pregnanc circumst birth infant compar exosurftr infant requir significantli less oxygen respiratori support first  day life comparison airtreat infant fewer infant exosurf group pulmonari interstiti emphysema  vs  p   exosurf group signific reduct combin outcom neonat death surviv bronchopulmonari dysplasia  vs  p   signific increas rate surviv without diseas  vs  p   di

#### A note on WordNet

Consider the two results below.

In [None]:
lemmatizer.lemmatize('loving')

'loving'

In [None]:
lemmatizer.lemmatize('loving', 'v')

'love'

This happens because ```loving``` can be either a verb or an adjective, so its lemma differs. When no part of speech is given, ```lemmatize``` treats the word as a noun, which is why the first call returns it unchanged. Passing POS tags to the lemmatizer is important for getting the best representation. Even without them, however, the results are good in most cases, so we will not cover POS tagging here.

Princeton University has a [resource](http://wordnetweb.princeton.edu/perl/webwn) for looking up words and the information WordNet holds about them.



### Preprocessing function

We will write a function that takes a text and applies all the preprocessing steps.


In [None]:
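def preprocesar(texto):
    stop_words = set(stopwords.words('english'))  # built once per call; set lookups are fast
    limpio = re.sub(r'[^a-z\s]+', '', texto.lower())  # lowercase, keep only letters and whitespace
    return ' '.join(stemmer.stem(x) for x in limpio.split(' ') if x not in stop_words)  # drop stopwords, stem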

## Vectorization

This step transforms the words in our text into numbers that the models can take as input.

The methods for vectorizing text are:

- **One-hot encoding**: 1 if the term is in the document and 0 if it is not.
- **Frequency**: the number of times the term appears in the document.
- **TF-IDF**: usually, the count of the term in the document multiplied by the natural logarithm of the total number of documents divided by the number of documents that contain the term.
- **Distributed representation**: the representation of the term is spread over more than one entry of the final vector.
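
The first three weightings can be sketched in plain Python as follows. The tiny tokenized corpus is made up, and the TF-IDF here is the textbook form described in the list (count times the natural log of N over document frequency); library implementations may use a smoothed or normalized variant.

```python
import math
from collections import Counter

# Made-up, already tokenized corpus (assumption for illustration only).
docs = [["plane", "flies", "plane"], ["plane", "lands"], ["pilot", "rests"]]

vocabulary = sorted({word for doc in docs for word in doc})
n_docs = len(docs)
# Document frequency: in how many documents each word appears.
doc_freq = {word: sum(word in doc for doc in docs) for word in vocabulary}

def one_hot(doc):
    """1 if the word occurs in the document, 0 otherwise."""
    return [1 if word in doc else 0 for word in vocabulary]

def frequency(doc):
    """Raw count of each vocabulary word in the document."""
    counts = Counter(doc)
    return [counts[word] for word in vocabulary]

def tf_idf(doc):
    """Count multiplied by ln(total documents / documents containing the word)."""
    counts = Counter(doc)
    return [counts[word] * math.log(n_docs / doc_freq[word]) for word in vocabulary]

print(vocabulary)
print(one_hot(docs[0]))
print(frequency(docs[0]))
print([round(value, 4) for value in tf_idf(docs[0])])
```

In the first document, ```plane``` occurs twice but also appears in two of the three documents, so its TF-IDF weight ends up lower than that of ```flies```, which occurs once but only in that document.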

This time we focus on the first three, and we will stay with ```tf-idf``` because it tends to work well in many cases.

The example uses only two documents.

In [None]:
vectorizer = TfidfVectorizer()
tfidf_ejemplo = vectorizer.fit_transform([preprocesar(data['doc'][1]), preprocesar(data['doc'][50])])

# get_feature_names_out() replaces get_feature_names(), which was removed in scikit-learn 1.2
pd.DataFrame(tfidf_ejemplo.toarray(), index=[0, 1], columns=vectorizer.get_feature_names_out())

Unnamed: 0,activ,airflow,appear,area,beta,bronchiolar,bronchopulmonari,caus,chang,chest,chylothorax,cl,clearanc,complianc,complic,concentr,concentrationtim,correl,curv,data,day,dictat,distribut,dose,drainag,dysplasia,effect,elimin,evid,five,function,fusion,girl,halftim,heart,hr,idiopath,improv,includ,infant,...,iv,manag,measur,mechan,microgramskg,min,nonop,nutrit,occur,parenter,percent,pharmacokinet,posit,posterior,preterm,provid,pulmonari,rate,receiv,receptor,resist,respiratori,respons,salbutamol,scoliosi,serum,signific,six,sixth,spinal,studi,success,system,total,treat,tube,unusu,vd,volum,yearold
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136709,0.273418,0.0,0.0,0.0,0.136709,0.0,0.0,0.0,0.0,0.0,0.136709,0.0,0.0,0.0,0.136709,0.0,0.0,0.0,0.0,0.0,0.0,0.410128,0.136709,0.0,0.0,0.0,0.136709,0.0,0.136709,0.0,...,0.0,0.136709,0.0,0.0,0.0,0.0,0.136709,0.136709,0.136709,0.136709,0.0,0.0,0.0,0.410128,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136709,0.0,0.0,0.0,0.0,0.410128,0.0,0.136709,0.0,0.09727,0.136709,0.136709,0.136709,0.0,0.0,0.136709
1,0.07382,0.07382,0.07382,0.07382,0.07382,0.07382,0.14764,0.07382,0.14764,0.0,0.0,0.07382,0.07382,0.07382,0.0,0.07382,0.07382,0.22146,0.07382,0.07382,0.0,0.07382,0.07382,0.07382,0.0,0.14764,0.07382,0.07382,0.07382,0.07382,0.07382,0.0,0.0,0.07382,0.07382,0.14764,0.0,0.07382,0.0,0.442921,...,0.07382,0.0,0.07382,0.07382,0.14764,0.14764,0.0,0.0,0.0,0.0,0.07382,0.07382,0.07382,0.0,0.07382,0.07382,0.07382,0.07382,0.07382,0.07382,0.07382,0.22146,0.14764,0.516741,0.0,0.07382,0.07382,0.07382,0.07382,0.0,0.07382,0.0,0.07382,0.052524,0.0,0.0,0.0,0.14764,0.07382,0.0
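
The values in this table can be checked by hand. With its default settings, ```TfidfVectorizer``` uses a smoothed idf, ln((1 + n) / (1 + df)) + 1, and then scales each row to unit Euclidean length. The sketch below only reproduces the idf part and compares it with two numbers read from row 0 of the table, assuming that ```total``` (the only term shown with a nonzero weight in both rows) occurs once in document 0.

```python
import math

n_docs = 2  # the example was fitted on two documents

def smooth_idf(doc_freq, n_docs):
    """Smoothed idf: ln((1 + n) / (1 + df)) + 1."""
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1

idf_one_doc = smooth_idf(1, n_docs)    # term that appears in a single document
idf_both_docs = smooth_idf(2, n_docs)  # term that appears in both documents

# Row normalization scales every entry of a row by the same factor, so the
# ratio between two entries with the same count is the ratio of their idfs.
expected_ratio = idf_one_doc / idf_both_docs
observed_ratio = 0.136709 / 0.09727  # e.g. 'chest' vs. 'total' in row 0

print(round(expected_ratio, 4), round(observed_ratio, 4))
```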


## Complete workflow for all the data

The workflow below shows how the whole process could be carried out.

In [None]:
%%time
corpus = data['doc'].apply(preprocesar)

CPU times: user 5min 17s, sys: 28.2 s, total: 5min 46s
Wall time: 5min 46s


In [None]:
#corpus

In [None]:
vect = TfidfVectorizer(min_df=.05, max_df=.95)
bow_tfidf = vect.fit_transform(corpus)
# get_feature_names_out() replaces get_feature_names(), which was removed in scikit-learn 1.2
pd.DataFrame(bow_tfidf.toarray(), index=data.index, columns=vect.get_feature_names_out())

Unnamed: 0,abnorm,activ,acut,addit,administr,affect,age,also,alter,although,among,analysi,antibodi,appear,area,arteri,assess,associ,base,blood,cancer,carcinoma,cardiac,care,case,caus,cell,chang,children,chronic,clinic,combin,common,compar,comparison,complet,complic,concentr,conclud,condit,...,six,small,specif,studi,subject,success,suggest,support,surgeri,surgic,surviv,symptom,syndrom,system,techniqu,test,therapi,three,thu,time,tissu,total,treat,treatment,trial,tumor,two,type,underw,use,valu,ventricular,week,well,wherea,whether,within,without,women,year
0,0.000000,0.0,0.0,0.000000,0.102261,0.0,0.073868,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.186388,0.0,0.065717,0.102369,0.000000,0.085409,0.0,0.08929,0.000000,...,0.000000,0.0,0.0,0.095179,0.000000,0.000000,0.000000,0.100314,0.000000,0.000000,0.374723,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.077769,0.189762,0.307108,0.0,0.063056,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.097615,0.000000,0.244016,0.0,0.000000
1,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.358551,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.402099,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.356367,0.326477,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000
2,0.286464,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.10864,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.126463,0.0,0.0,0.318293,0.000000,0.103539,0.152872,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.153671,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.103760,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000
3,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.147307,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.213724,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.439321,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.204872,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.356700,0.290124,0.000000,0.0,0.144609,0.0,0.0,0.235523,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000
4,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.102948,0.09836,0.0,0.0,0.0,0.342516,0.0,0.115855,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.136338,...,0.000000,0.0,0.0,0.066324,0.000000,0.000000,0.189006,0.139805,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.437079,0.536748,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.094556
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13919,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.075485,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.521493,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.087565,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.443949,0.0,0.0,0.000000,0.102495,0.000000,0.0,0.000000
13920,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.176765,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.113446,0.000000,0.0,0.000000,0.000000,0.163356,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.172153,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.000000
13921,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.287438,0.0,0.101346,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.547469,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.097243,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.134501,0.000000,0.0,0.000000
13922,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.089749,0.00000,0.0,0.0,0.0,0.000000,0.0,0.000000,0.358478,0.0,0.0,0.00000,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,...,0.000000,0.0,0.0,0.000000,0.103725,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.108968,0.0,0.000000,0.0,0.0,0.000000,0.317900,0.000000,0.0,0.000000
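
The ```min_df``` and ```max_df``` arguments used above prune the vocabulary by document frequency: given as floats, they are proportions of documents, and terms that appear in fewer than ```min_df``` or more than ```max_df``` of the documents are dropped. A rough sketch of that filter in plain Python, on a made-up tokenized corpus and with a hypothetical helper name, might look like this:

```python
from collections import Counter

def filter_by_doc_freq(tokenized_docs, min_df, max_df):
    """Keep terms whose document count lies within [min_df * n, max_df * n].

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    n_docs = len(tokenized_docs)
    # Count each term once per document, however often it repeats inside it.
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return sorted(
        term
        for term, count in doc_freq.items()
        if min_df * n_docs <= count <= max_df * n_docs
    )

# Made-up corpus: 'patient' is in every document, 'study' in half of them.
toy_docs = [
    ["patient", "study", "chylothorax"],
    ["patient", "study"],
    ["patient", "trial"],
    ["patient"],
]

kept = filter_by_doc_freq(toy_docs, min_df=0.5, max_df=0.95)
print(kept)
```

Terms that are too rare to generalize and terms so common that they barely distinguish documents are both removed, which keeps the final matrix small.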
