# Search Engine Experiments

Tests of different tokenization methods and word embeddings for the search engine over the Equinox articles by Asesoftware.

## CSV of Articles

CSV columns: "article_name", "enumeration_in_article", "content", "stringWithAllText"

In [1]:
import pandas as pd

# Load the paragraphs once, then keep an independent copy per tokenization variant
df = pd.read_csv("articles_paragraphs.csv")
df_word = df.copy()
df_stem = df.copy()
df_lemma = df.copy()

## Data Preprocessing and Tokenization

### Whitespace with Lemma Tokenization

In [2]:
import pandas as pd
import string
import spacy

'''
This cell uses spaCy to preprocess and tokenize the text: it lowercases the text,
removes punctuation, lemmatizes each word, and drops stopwords and words of two
characters or fewer. The function is applied to every paragraph in the 'content'
column, producing a list of lists in which each sublist holds the tokens of one
paragraph.
'''

# load spacy nlp model
nlp = spacy.load('en_core_web_sm')

# define function for pre-processing and tokenization
def preprocess_text_lemma(text):
    # lowercase
    text = text.lower()
    # remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # lemmatize
    doc = nlp(text)
    lemmatized_text = [token.lemma_ for token in doc]
    # remove stopwords and short words
    stopwords = spacy.lang.en.stop_words.STOP_WORDS
    tokens = [token for token in lemmatized_text if token not in stopwords and len(token) > 2]
    return tokens

# apply pre-processing and tokenization to the 'content' column of each row
tokenized_paragraphs_lemma = []
for paragraph in df['content']:
    tokens = preprocess_text_lemma(paragraph)
    tokenized_paragraphs_lemma.append(tokens)

# print the resulting list of lists of tokens
print(tokenized_paragraphs_lemma)


[['decade', 'transform', 'multiple', 'field', 'knowledge', 'medicine', 'transformation', 'different', 'way', 'enhance', 'medicine', 'use', 'article', 'introduce', 'help', 'discover', 'new', 'drug', 'understand', 'mystery', 'cancer', 'learn', 'billion', 'relation', 'different', 'research', 'resource'], ['time', 'help', 'human', 'research', '2007', 'adam', 'robot', 'generate', 'hypothesis', 'gene', 'code', 'critical', 'enzyme', 'catalyze', 'reaction', 'yeast', 'saccharomyce', 'cerevisiae', 'adam', 'use', 'robotic', 'test', 'prediction', 'lab', 'physically', 'researcher', 'university', 'aberystwyth', 'cambridge', 'independently', 'test', 'adamsadam', 'hypothesis', 'function', 'gene', 'new', 'accurate', 'wrong', 'example', 'multiple', 'application', 'field', 'ready', 'learn'], ['understand', 'cancer', 'discover', 'new', 'drug'], ['turn', 'drugdiscovery', 'paradigm', 'upside', 'use', 'patientdriven', 'biology', 'datum', 'derive', 'morepredictive', 'hypothesis', 'traditional', 'trialanderror

### Tokenization with Stemmer


In [3]:
import pandas as pd
import string
import spacy
from nltk.stem import SnowballStemmer

# load spacy nlp model
nlp = spacy.load('en_core_web_sm')
# load stemmer
stemmer = SnowballStemmer('english')

# define function for pre-processing and tokenization
def preprocess_text(text):
    # lowercase
    text = text.lower()
    # remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # stem
    doc = nlp(text)
    stemmed_text = [stemmer.stem(token.text) for token in doc]
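    # remove stopwords and short words
    # (filtering runs after stemming, so stopwords whose stem differs from the
    # original word, e.g. 'during' -> 'dure' or 'only' -> 'onli', are not removed)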
    stopwords = spacy.lang.en.stop_words.STOP_WORDS
    tokens = [token for token in stemmed_text if token not in stopwords and len(token) > 2]
    return tokens

# apply pre-processing and tokenization to the 'content' column of each row
tokenized_paragraphs_stem = []
for paragraph in df['content']:
    tokens = preprocess_text(paragraph)
    tokenized_paragraphs_stem.append(tokens)

# print the resulting list of lists of tokens
print(tokenized_paragraphs_stem)


[['dure', 'decad', 'transform', 'multipl', 'field', 'knowledg', 'medicin', 'transform', 'mani', 'differ', 'way', 'enhanc', 'medicin', 'use', 'articl', 'introduc', 'help', 'discov', 'new', 'drug', 'understand', 'mysteri', 'cancer', 'learn', 'billion', 'relat', 'differ', 'research', 'resourc'], ['time', 'help', 'human', 'research', '2007', 'adam', 'robot', 'generat', 'hypothes', 'gene', 'code', 'critic', 'enzym', 'catalyz', 'reaction', 'yeast', 'saccharomyc', 'cerevisia', 'adam', 'use', 'robot', 'test', 'predict', 'lab', 'physic', 'research', 'univers', 'aberystwyth', 'cambridg', 'independ', 'test', 'adamsadam', 'hypothes', 'function', 'gene', 'new', 'accur', 'onli', 'wrong', 'onli', 'exampl', 'multipl', 'applic', 'field', 'readi', 'learn'], ['understand', 'cancer', 'discov', 'new', 'drug'], ['turn', 'drugdiscoveri', 'paradigm', 'upsid', 'use', 'patientdriven', 'biolog', 'data', 'deriv', 'morepredict', 'hypothes', 'tradit', 'trialanderror', 'approach', 'exampl', 'boston', 'berg', 'biotec

### Word-based Tokenization (No Stemmer or Lemma, Whole Words Only)

In [4]:
import pandas as pd
import string
import spacy

# load spacy nlp model
nlp = spacy.load('en_core_web_sm')

# define function for pre-processing and tokenization
def preprocess_text(text):
    # lowercase
    text = text.lower()
    # remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # split into words
    words = text.split()
    # remove stopwords and short words
    stopwords = spacy.lang.en.stop_words.STOP_WORDS
    tokens = [word for word in words if word not in stopwords and len(word) > 2]
    return tokens

# apply pre-processing and tokenization to the 'content' column of each row
tokenized_paragraphs_word = []
for paragraph in df['content']:
    tokens = preprocess_text(paragraph)
    tokenized_paragraphs_word.append(tokens)

# print the resulting list of lists of tokens
print(tokenized_paragraphs_word)


[['decades', 'transformed', 'multiple', 'fields', 'knowledge', 'medicine', 'transformation', 'different', 'ways', 'enhance', 'medicine', 'article', 'introduce', 'help', 'discover', 'new', 'drugs', 'understand', 'mysteries', 'cancer', 'learn', 'billion', 'relations', 'different', 'research', 'resources'], ['time', 'helped', 'humans', 'research', '2007', 'adam', 'robot', 'generated', 'hypotheses', 'genes', 'code', 'critical', 'enzymes', 'catalyze', 'reactions', 'yeast', 'saccharomyces', 'cerevisiae', 'adam', 'robotics', 'test', 'predictions', 'lab', 'physically', 'researchers', 'universities', 'aberystwyth', 'cambridge', 'independently', 'tested', 'adamsadams', 'hypotheses', 'functions', 'genes', 'new', 'accurate', 'wrong', 'example', 'multiple', 'applications', 'field', 'ready', 'learn'], ['understanding', 'cancer', 'discovering', 'new', 'drugs'], ['turning', 'drugdiscovery', 'paradigm', 'upside', 'patientdriven', 'biology', 'data', 'derive', 'morepredictive', 'hypotheses', 'traditional
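
The outputs above contain merged tokens such as 'drugdiscovery', 'patientdriven', and 'trialanderror', because `str.translate` with `string.punctuation` deletes hyphens instead of separating the words around them. A minimal sketch of one possible alternative is to map each punctuation character to a space before splitting. The function name `split_on_punctuation`, the small stopword set, and the sample sentence are assumptions for illustration only; the notebook itself uses spaCy's `STOP_WORDS`.

```python
import string

# Map every ASCII punctuation character to a space instead of deleting it,
# so "drug-discovery" becomes "drug discovery" rather than "drugdiscovery".
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

# Tiny illustrative stopword set (assumption); the notebook uses spaCy's STOP_WORDS.
DEMO_STOPWORDS = {"the", "and", "not"}

def split_on_punctuation(text, stopwords=DEMO_STOPWORDS, min_len=3):
    """Lowercase, turn punctuation into spaces, split, then filter tokens."""
    words = text.lower().translate(PUNCT_TO_SPACE).split()
    return [w for w in words if w not in stopwords and len(w) >= min_len]

# Hypothetical sample sentence, not taken from the articles
print(split_on_punctuation("patient-driven biology, not trial-and-error"))
```

Whether splitting or merging hyphenated terms works better for search is an open design choice; splitting lets a query for "drug" match "drug-discovery", at the cost of losing the compound as a single token.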

## Word Embedding

### Tokenization with Lemma

In [7]:
import gensim
import numpy as np

'''
Here we train the Word2Vec model with a list of lists where each sublist is a tokenized paragraph.
After we get the word vectors per paragraph, we compute our paragraph meaning vector as the mean
of its word vectors.
'''

# Train Word2Vec model
model = gensim.models.Word2Vec(tokenized_paragraphs_lemma, window=5, min_count=1, workers=4)
model.save("paragraphModel")

# Calculate the meaning vector per paragraph
paragraph_vectors_lemma = []
for paragraph_tokens in tokenized_paragraphs_lemma:
    vectors = []
    for token in paragraph_tokens:
        if token in model.wv.key_to_index:
            vectors.append(model.wv[token])
    if len(vectors) > 0:
        paragraph_vectors_lemma.append(np.mean(vectors, axis=0))
    else:
        paragraph_vectors_lemma.append(np.zeros(model.vector_size))

print(paragraph_vectors_lemma[383])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]
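
The averaging step can be isolated into a small helper, which also makes the all-zero output above easier to interpret: since the model is trained with `min_count=1`, every training token is in the vocabulary, so a zero vector most likely means the paragraph produced no tokens at all after preprocessing. The sketch below is an illustration under assumptions: `mean_vector` is a hypothetical helper name, and a plain dict of toy 3-dimensional vectors stands in for the trained `model.wv`.

```python
import numpy as np

def mean_vector(tokens, word_vectors, dim):
    """Average the vectors of the tokens found in word_vectors; zeros if none."""
    found = [word_vectors[t] for t in tokens if t in word_vectors]
    if not found:
        # Same fallback as the notebook: no known tokens -> all-zero vector
        return np.zeros(dim)
    return np.mean(found, axis=0)

# Toy 3-dimensional vectors (assumption); a trained model.wv would play this role.
toy_vectors = {
    "cancer": np.array([1.0, 0.0, 2.0]),
    "drug": np.array([3.0, 2.0, 0.0]),
}

print(mean_vector(["cancer", "drug", "unknown"], toy_vectors, dim=3))  # [2. 1. 1.]
print(mean_vector([], toy_vectors, dim=3))                              # [0. 0. 0.]
```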


### Tokenization with Stem

In [8]:
import gensim
import numpy as np

'''
Here we train the Word2Vec model with a list of lists where each sublist is a tokenized paragraph.
After we get the word vectors per paragraph, we compute our paragraph meaning vector as the mean
of its word vectors.
'''

# Train Word2Vec model
model = gensim.models.Word2Vec(tokenized_paragraphs_stem, window=5, min_count=1, workers=4)

# Calculate the meaning vector per paragraph
paragraph_vectors_stem = []
for paragraph_tokens in tokenized_paragraphs_stem:
    vectors = []
    for token in paragraph_tokens:
        if token in model.wv.key_to_index:
            vectors.append(model.wv[token])
    if len(vectors) > 0:
        paragraph_vectors_stem.append(np.mean(vectors, axis=0))
    else:
        paragraph_vectors_stem.append(np.zeros(model.vector_size))

print(paragraph_vectors_stem[383])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]


### Tokenization with Word

In [9]:
import gensim
import numpy as np

'''
Here we train the Word2Vec model with a list of lists where each sublist is a tokenized paragraph.
After we get the word vectors per paragraph, we compute our paragraph meaning vector as the mean
of its word vectors.
'''

# Train Word2Vec model
model = gensim.models.Word2Vec(tokenized_paragraphs_word, window=5, min_count=1, workers=4)

# Calculate the meaning vector per paragraph
paragraph_vectors_word = []
for paragraph_tokens in tokenized_paragraphs_word:
    vectors = []
    for token in paragraph_tokens:
        if token in model.wv.key_to_index:
            vectors.append(model.wv[token])
    if len(vectors) > 0:
        paragraph_vectors_word.append(np.mean(vectors, axis=0))
    else:
        paragraph_vectors_word.append(np.zeros(model.vector_size))

print(paragraph_vectors_word[383])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]


#### Add the vectors to the DataFrame of their respective tokenization


In [10]:
df_lemma['vector'] = paragraph_vectors_lemma
df_word['vector'] = paragraph_vectors_word
df_stem['vector'] = paragraph_vectors_stem

## Similarity Function

In [11]:

import numpy as np

def cosine_similarity_list(vectors_list, query_vector):
    # Compute the cosine similarity between the query vector and each paragraph vector.
    # An all-zero paragraph vector has norm 0, so its score is NaN (0/0) and NumPy
    # emits a RuntimeWarning; those entries are filtered out below.
    similarity_scores = []
    for vector in vectors_list:
        score = query_vector.dot(vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))
        similarity_scores.append(score)

    # Take the first n candidates in descending order (NaN scores sort first after
    # the reversal), drop vectors whose components sum to zero, and return the
    # top 2 as [vector, index] pairs
    n = 20
    most_similar_sentences = [[vectors_list[idx], idx] for idx in np.argsort(similarity_scores)[::-1][:n] if np.sum(vectors_list[idx]) != 0]

    return most_similar_sentences[:2]
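
As a hedged alternative, the zero-norm case can be handled before dividing, which avoids the NaN scores and no longer depends on the zero vectors fitting inside the first `n` candidates. This is a sketch under assumptions, not the notebook's method: `rank_by_cosine` is a hypothetical name, and it returns `(index, score)` pairs rather than `[vector, index]` pairs.

```python
import numpy as np

def rank_by_cosine(vectors, query, top_k=2):
    """Return (index, score) pairs for the top_k vectors most similar to query."""
    matrix = np.asarray(list(vectors), dtype=float)
    query = np.asarray(query, dtype=float)
    # Product of norms per row; zero when either the row or the query is all zeros
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    valid = norms > 0  # all-zero vectors have no direction, so they are skipped
    scores = np.full(len(matrix), -np.inf)
    scores[valid] = matrix[valid] @ query / norms[valid]
    order = np.argsort(scores)[::-1]  # highest score first
    return [(int(i), float(scores[i])) for i in order if valid[i]][:top_k]

# Toy 2-dimensional vectors (assumption), including one all-zero vector
toy = [np.array([1.0, 0.0]), np.array([0.0, 0.0]),
       np.array([1.0, 1.0]), np.array([0.0, 1.0])]
print(rank_by_cosine(toy, np.array([1.0, 0.0])))
```

Returning the score alongside the index makes it possible to inspect how close the matches actually are, which the `[vector, index]` form does not show.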


In [12]:
cosine_similarity_list(df_lemma['vector'],df_lemma['vector'][0])

  score = query_vector.dot(vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))


[[array([-3.0001565e-03,  6.0817013e-03,  3.0248684e-03,  2.3221853e-03,
          6.7238952e-04, -7.3403749e-03,  1.7167320e-03,  1.3370595e-02,
         -5.1106163e-03, -3.4131350e-03, -2.9921825e-03, -1.1933232e-02,
         -1.9376529e-03,  2.5431793e-03,  5.1076576e-04, -2.6100934e-03,
          1.4874681e-03, -9.6749915e-03,  1.8927505e-03, -9.7666932e-03,
          3.6802364e-04,  2.2483352e-03,  8.3191106e-03, -1.8032229e-03,
         -2.5208161e-04, -6.0063711e-04, -6.8854871e-03, -7.6380931e-03,
         -4.5862729e-03,  4.0770806e-03,  5.6194663e-03, -5.2729406e-04,
          5.5518519e-04, -4.1494835e-03, -2.2660447e-03,  5.6226244e-03,
         -1.8693224e-04, -5.8829598e-03, -2.9170315e-03, -8.5848719e-03,
          4.1973386e-03, -7.2795749e-03, -1.7844482e-03, -9.9424529e-04,
          7.6926951e-03, -5.3491485e-03, -3.7695109e-03, -3.1949903e-03,
          3.0418648e-03,  8.3284592e-03,  1.5193403e-03, -5.4000146e-03,
         -3.3501256e-03, -2.1535468e-03, -3.9270003

In [13]:
cosine_similarity_list(df_stem['vector'],df_stem['vector'][0])

  score = query_vector.dot(vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))


[[array([-5.9623658e-03,  6.2305806e-03,  4.1812239e-03,  5.6000933e-04,
         -5.4101530e-03, -1.7727932e-02,  3.0544521e-03,  2.4840258e-02,
         -9.5516620e-03, -4.5803604e-03, -7.3736836e-03, -1.9212138e-02,
         -3.1252922e-03,  4.5068027e-03,  1.4202499e-03, -1.1181517e-02,
          1.7389309e-03, -1.8517375e-02,  3.1642024e-03, -1.9103048e-02,
          4.0290644e-03,  3.8335151e-03,  1.3441882e-02, -5.6379898e-03,
         -6.4278692e-03, -4.0387451e-03, -9.3177240e-03, -1.0881801e-02,
         -1.2722836e-02, -5.0117131e-03,  1.5334750e-02,  1.3150775e-03,
          4.7757174e-03, -1.0928935e-02, -1.4443212e-03,  1.1467726e-02,
         -2.1158212e-03, -1.3312233e-02, -9.5307045e-03, -2.1072399e-02,
         -5.7473811e-05, -1.3529041e-02, -1.9572387e-03,  7.4743462e-04,
          6.8182754e-03, -3.0653107e-03, -1.0398553e-02, -2.8556716e-03,
          5.2610869e-03,  5.1269927e-03,  3.9463481e-03, -5.9619602e-03,
          1.4696266e-03,  4.0391579e-04, -5.8728228

In [14]:
cosine_similarity_list(df_word['vector'],df_word['vector'][0])

  score = query_vector.dot(vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))


[[array([-1.9736486e-03,  1.9758206e-03,  1.9090087e-03,  6.3272186e-05,
         -1.7143670e-03, -1.8648246e-03,  3.9950808e-04,  2.1819030e-03,
         -2.0961182e-03,  3.9362966e-04, -4.6463171e-03, -2.8381497e-03,
         -7.6613307e-04, -3.6716598e-04, -7.3207042e-04, -1.8209754e-03,
         -1.6263704e-05, -1.8377993e-03, -9.6147449e-04, -2.2483876e-03,
          1.3725298e-03, -6.7580852e-04,  1.6828479e-03,  3.3540322e-04,
         -3.6949152e-04, -7.0606329e-04, -1.6790045e-04, -2.0997408e-03,
         -2.6461895e-04, -1.1765423e-03,  1.5982522e-03,  1.6180351e-03,
          2.2736338e-03,  6.6749606e-04,  1.5835830e-03,  4.1631116e-03,
          1.9663369e-04, -2.0184615e-03, -4.1448866e-04, -3.9851614e-03,
         -5.5691227e-04, -2.8431837e-03, -5.1358185e-04, -1.1352391e-03,
          2.1385143e-03, -3.6190813e-03, -1.1265416e-03, -1.0295956e-03,
         -6.4637366e-04,  2.4719682e-04,  2.7881125e-03, -1.1221399e-03,
         -8.8718778e-04, -1.4066044e-03, -1.0350315

In [15]:
print(df_word['content'][0],'\n',df_word['content'][86],'\n',df_word['content'][737])

During the last decades, AI transformed multiple fields of knowledge; medicine is not out of this transformation. There are many different ways in which we can enhance medicine using AI. In this article, I will introduce you to some of how AI can help discover new drugs, understand the mysteries of cancer, and learn up to one billion relations between different research resources. 
 Despite their differences, natural thinking and AI can complement each other in many ways. For example, AI systems can be used to help human decision-making by providing insights and predictions based on large amounts of data. In turn, human cognition can be used to validate and refine the output of AI systems, as well as to provide context and interpret results. So next time you learn something new, take a moment to marvel at the incredible power of your brain and natural learning, and don't underestimate yourself! You are not a machine and you don't need to be. 
 AI could even recommend which products you

##### Full String

## Prompt Embedding

##### For Whitespace with Lemma tokenization

In [16]:
userPrompt = "medicine using artificial intelligence"

def preprocess_text_lemma(text):
    # lowercase
    text = text.lower()
    # remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # lemmatize
    doc = nlp(text)
    lemmatized_text = [token.lemma_ for token in doc]
    # remove stopwords and short words
    stopwords = spacy.lang.en.stop_words.STOP_WORDS
    tokens = [token for token in lemmatized_text if token not in stopwords and len(token) > 2]
    return tokens

tokenized_prompt = preprocess_text_lemma(userPrompt)
print(tokenized_prompt)

   


['medicine', 'use', 'artificial', 'intelligence']


In [17]:

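# Reload the lemma-based Word2Vec model saved earlier as "paragraphModel"
paragraphModel = gensim.models.Word2Vec.load("paragraphModel")

promptVector = np.zeros((paragraphModel.vector_size,))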
word_count = 0

for token in tokenized_prompt:
    if token in paragraphModel.wv.key_to_index:
        promptVector += paragraphModel.wv[token]
        word_count += 1

if word_count > 0:
    promptVector /= word_count
    
print(promptVector)

[-0.01607061  0.01545787  0.00776466  0.01083454  0.0028751  -0.01847648
  0.00703296  0.02710505 -0.0120265  -0.00723788 -0.00866236 -0.02951655
 -0.00187614  0.00511719  0.00023503 -0.01042503  0.00327789 -0.02274382
  0.00334835 -0.02666028  0.0035325   0.0114821   0.02047421 -0.01276879
  0.00074719  0.00116523 -0.01682056 -0.01532176 -0.01076703  0.00584799
  0.01915145 -0.00230061  0.0018643  -0.00485657 -0.00764076  0.01892312
  0.00378248 -0.01406088 -0.00639728 -0.02702168  0.00813719 -0.01798417
 -0.00242203  0.00111154  0.01802703 -0.00828405 -0.01501738 -0.00242449
  0.00316604  0.01162745  0.00871932 -0.00769801 -0.00207634 -0.00634624
 -0.0125396   0.00988593  0.01335823  0.00019921 -0.00958345  0.00057228
  0.00302988  0.00183022  0.00229691 -0.00084571 -0.01752289  0.01297649
  0.00159977  0.01219904 -0.01749782  0.0149163  -0.01317433  0.01045302
  0.01845125 -0.00503777  0.01268131  0.0124661  -0.00558862 -0.00633993
 -0.01415724  0.0081356  -0.00644995 -0.00691469 -0

### For Stemmed Tokenization

## Similarity Test

In [18]:
var=cosine_similarity_list(df_lemma['vector'],promptVector)
var[0]

  score = query_vector.dot(vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))


[array([-1.28360977e-02,  8.89567751e-03,  3.87804816e-03,  6.01529470e-03,
         1.77774229e-03, -1.28061324e-02,  4.41849884e-03,  2.07933877e-02,
        -7.07202032e-03, -5.51227015e-03, -6.59781601e-03, -1.66020170e-02,
        -7.04924203e-03, -6.12513395e-04,  1.77324249e-03, -6.07957784e-03,
         2.44163815e-03, -1.72612164e-02,  8.66064220e-04, -1.98348202e-02,
         4.29907860e-03,  1.08559849e-02,  1.46967806e-02, -1.09759560e-02,
        -3.19354469e-03,  1.74213073e-03, -6.23127259e-03, -9.47997533e-03,
        -7.89075252e-03,  1.38939009e-03,  1.35768782e-02, -1.00741396e-03,
        -1.34902622e-03, -1.36699236e-03, -5.18203015e-03,  1.12579381e-02,
         3.13007249e-03, -1.29321357e-02, -2.54960195e-03, -1.95277371e-02,
         7.26161199e-03, -1.51912551e-02, -1.20271754e-03, -5.82972891e-04,
         1.19639821e-02, -5.05094556e-03, -8.91781319e-03, -2.65438552e-03,
         5.00931265e-03,  6.84221927e-03,  5.86616015e-03, -6.04497688e-03,
        -5.2

In [19]:
df["content"][var[1][1]]

'The problem is that it is thought that the more, the better, and this only applies to the use of artificial intelligence, not to the design of the robot. So, for example, when a robot with a complex design tries to emulate emotions with many facial expressions or large movements of its joints, no matter how complex its artificial intelligence models are, the result? The famous but unwanted uncanny valley.'

In [20]:
df_lemma

| Unnamed: 0.1 | Unnamed: 0 | article_name | content | enumeration_in_article | file_id | vector |
|---|---|---|---|---|---|---|
| 0 | 0 | Enhance medicine using AI | During the last decades, AI transformed multip... | 0 | 0 | [-0.0030001565, 0.0060817013, 0.0030248684, 0.... |
| 1 | 1 | Enhance medicine using AI | The first time AI helped humans research was i... | 1 | 0 | [-0.0040250286, 0.0035326334, 0.00047616923, 0... |
| 2 | 2 | Enhance medicine using AI | From understanding cancer to discovering new d... | 2 | 0 | [0.001019223, 0.0018933084, 0.000984444, 0.004... |
| 3 | 3 | Enhance medicine using AI | AI is turning the drug-discovery paradigm upsi... | 3 | 0 | [-0.0040087057, 0.004059168, 0.0014405518, 0.0... |
| 4 | 4 | Enhance medicine using AI | Another contribution of AI to this field was m... | 4 | 0 | [-0.0022907036, 0.004143745, 0.000416351, 0.00... |
| ... | ... | ... | ... | ... | ... | ... |
| 880 | 880 | Aprendizaje profundo o nadar en la orilla | » NO ES NECESARIO COMPRENDER EL FUNCIONAMIENTO... | 18 | 23 | [-0.0068550766, 0.009261615, 0.0020293624, 0.0... |
| 881 | 881 | Aprendizaje profundo o nadar en la orilla | » HAY PODER DE CÓMPUTO PARA ENTRENAR (O CÓMO P... | 19 | 23 | [-0.0061374186, 0.0061156475, 0.0011317254, 0.... |
| 882 | 882 | Aprendizaje profundo o nadar en la orilla | ¿ESTÁ TODO? ¡FELICITACIONES! DÉ EL SALTO Y DIS... | 20 | 23 | [-0.0068615307, 0.0070413025, 0.002013113, 0.0... |
| 883 | 883 | Aprendizaje profundo o nadar en la orilla | Piénselo, tal vez hay otras alternativas de so... | 21 | 23 | [-0.0029982403, 0.0047661145, 0.005103954, -0.... |
