Elasticsearch
# Document search engine

## Objectives

 - Understand the architecture of a search engine
 - Understand how to analyze, index, and formulate searches for different applications
 - Understand how relevance is implemented in ES
 - Understand how to use the relevance options to optimize search results

## Architecture of a search engine

[TODO] Image


In [2]:
import requests

# The index does not exist yet
r = requests.get('http://localhost:9200/myindex')
r.json()

{u'error': u'IndexMissingException[[myindex] missing]', u'status': 404}

In [19]:
index_options = """{
    "settings" : {
        "index" : {
            "number_of_shards" : 3,
            "number_of_replicas" : 2
        }
    }
}"""



# Create an index with 3 primary shards and 2 replicas
r = requests.put('http://localhost:9200/tvseries', data = index_options)
r.json()

{u'acknowledged': True}
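
The settings body does not have to be a hand-written string. As a minimal sketch, the two helpers below (`index_settings` and `total_shard_copies` are hypothetical names, not part of requests or Elasticsearch) build the same JSON from a dict and work out how many shard copies the cluster has to allocate: every primary shard is stored once plus once per replica.

```python
import json

def index_settings(number_of_shards, number_of_replicas):
    """Return the JSON body for creating an index (hypothetical helper)."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }
    return json.dumps(body)

def total_shard_copies(number_of_shards, number_of_replicas):
    """Primaries plus one extra copy of every primary per replica."""
    return number_of_shards * (1 + number_of_replicas)

index_options = index_settings(3, 2)
print(index_options)
print(total_shard_copies(3, 2))
```

The resulting string can be passed as `data = index_options` in the `requests.put` call. With 3 primaries and 2 replicas the cluster has to place 9 shards, and a single-node setup can only allocate the 3 primaries.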

In [33]:
r = requests.get('http://localhost:9200/tvseries?pretty')
print r.text

{
  "tvseries" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1454535155090",
        "uuid" : "NuiuQ9csQQefi8-QhdwhRA",
        "number_of_replicas" : "2",
        "number_of_shards" : "3",
        "version" : {
          "created" : "1070499"
        }
      }
    },
    "warmers" : { }
  }
}



In [29]:
r = requests.get('http://localhost:9200/tvseries/_aliases?pretty')
print r.text

{
  "tvseries" : {
    "aliases" : { }
  }
}



In [16]:
# Delete an index
r = requests.delete('http://localhost:9200/tvseries')
r.json()

{u'acknowledged': True}

## Mappings

Elasticsearch indexes documents automatically and efficiently.

Elasticsearch supports database-style queries (similar to an SQL selection), full-text queries, aggregations (though of a different kind than in SQL), and pagination in place of `LIMIT`.

In databases you define the data types in the physical schema, as data types and constraints in `CREATE TABLE`.

In search engines, data types are defined in mappings.
Elasticsearch does not require a schema definition: it can infer the schema from the documents it receives (dynamic mapping), which is convenient during development.
However, it is advisable to define the schema explicitly for search performance.
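
To make the idea of dynamic mapping concrete, here is a simplified sketch of how a schema might be inferred from a JSON document. It is an illustration only, not Elasticsearch's actual algorithm: the function names are hypothetical, the type names follow the Elasticsearch 1.x outputs in this notebook, date detection is left out, and the code assumes Python 3.

```python
def infer_type(value):
    """Guess an Elasticsearch 1.x field mapping for a JSON value (simplified)."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "double"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, dict):
        return {"properties": infer_properties(value)}
    if isinstance(value, list):
        # An array takes the type of its first non-null element
        for item in value:
            if item is not None:
                return infer_type(item)
    return None  # null or empty array: nothing to map yet

def infer_properties(doc):
    """Infer the 'properties' section of a mapping from one document."""
    properties = {}
    for field, value in doc.items():
        mapping = infer_type(value)
        if mapping is not None:
            properties[field] = mapping
    return properties

doc = {
    "id": 169,
    "name": "Breaking Bad",
    "genres": ["Drama", "Crime", "Thriller"],
    "rating": {"average": 9.3},
    "externals": {"tvrage": 18164, "thetvdb": 81189, "imdb": "tt0903747"},
    "webChannel": None,
}
print(infer_properties(doc))
```

Note that `webChannel` gets no mapping because its value is null, and `genres` is mapped as a plain string field because arrays have no type of their own.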


In [35]:
r = requests.get('http://localhost:9200/megacorp/_mappings?pretty')
print r.text

{
  "megacorp" : {
    "mappings" : {
      "employee" : {
        "properties" : {
          "about" : {
            "type" : "string"
          },
          "age" : {
            "type" : "long"
          },
          "first_name" : {
            "type" : "string"
          },
          "interests" : {
            "type" : "string"
          },
          "last_name" : {
            "type" : "string"
          }
        }
      }
    }
  }
}



In [67]:
r = requests.get('http://localhost:9200/tvseries/_mappings?pretty')
print r.text

{
  "tvseries" : {
    "mappings" : {
      "serie" : {
        "properties" : {
          "_links" : {
            "properties" : {
              "previousepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "self" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              }
            }
          },
          "externals" : {
            "properties" : {
              "imdb" : {
                "type" : "string"
              },
              "thetvdb" : {
                "type" : "long"
              },
              "tvrage" : {
                "type" : "long"
              }
            }
          },
          "genres" : {
            "type" : "string"
          },
          "id" : {
            "type" : "long"
          },
          "image" : {
            

In [58]:
breaking_bad = requests.get('http://api.tvmaze.com/singlesearch/shows?q=breaking-bad')

breaking_bad.text


u'{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg"},"summary":"<p><em><strong>\\"Breaking Bad\\"</strong></em> follows protagonist Walter White, a chemistry teacher who lives in New Mexico with his wife and teenage son who has cerebral palsy. White is diagnosed with Stage III cancer and given a prognosis of two years left to live. With a new sense of fearlessness based on his medical prognosis

In [48]:
r = requests.put('http://localhost:9200/tvseries/')
r.text

u'{"error":"IndexAlreadyExistsException[[tvseries] already exists]","status":400}'

In [52]:
r = requests.delete('http://localhost:9200/tvseries/')
r.text

u'{"acknowledged":true}'

In [59]:
breaking_bad.json()

{u'_links': {u'previousepisode': {u'href': u'http://api.tvmaze.com/episodes/12253'},
  u'self': {u'href': u'http://api.tvmaze.com/shows/169'}},
 u'externals': {u'imdb': u'tt0903747', u'thetvdb': 81189, u'tvrage': 18164},
 u'genres': [u'Drama', u'Crime', u'Thriller'],
 u'id': 169,
 u'image': {u'medium': u'http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg',
  u'original': u'http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg'},
 u'language': u'English',
 u'name': u'Breaking Bad',
 u'network': {u'country': {u'code': u'US',
   u'name': u'United States',
   u'timezone': u'America/New_York'},
  u'id': 20,
  u'name': u'AMC'},
 u'premiered': u'2008-01-20',
 u'rating': {u'average': 9.3},
 u'runtime': 60,
 u'schedule': {u'days': [u'Sunday'], u'time': u'22:00'},
 u'status': u'Ended',
 u'summary': u'<p><em><strong>"Breaking Bad"</strong></em> follows protagonist Walter White, a chemistry teacher who lives in New Mexico with his wife and teenage son who has cerebral palsy

In [93]:
r = requests.post('http://localhost:9200/tvseries/serie', data = breaking_bad.text)
r.text

u'{"_index":"tvseries","_type":"serie","_id":"AVKpORUrbPRGM5XqNXQK","_version":1,"created":true}'

In [66]:
r = requests.get('http://localhost:9200/tvseries/serie/1?pretty')
print r.text

{
  "_index" : "tvseries",
  "_type" : "serie",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source":{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg"},"summary":"<p><em><strong>\"Breaking Bad\"</strong></em> follows protagonist Walter White, a chemistry teacher who lives in New Mexico with his wife and teenage son who has cerebral palsy. White is diagnosed with Stage III cancer and g

In [73]:
r = requests.get('http://localhost:9200/tvseries/_search?q=genres:drama&pretty')
print r.text

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.15342641,
    "hits" : [ {
      "_index" : "tvseries",
      "_type" : "serie",
      "_id" : "1",
      "_score" : 0.15342641,
      "_source":{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg"},"summary":"<p><em><strong>\"Breakin

In [109]:
series = ['breaking-bad','blindspot','the-knick']

for s in series:
    # Look up each show on TVmaze and index it under its TVmaze id
    show = requests.get('http://api.tvmaze.com/singlesearch/shows?q=' + s)
    show_id = show.json()['id']   # avoid shadowing the built-in id()
    response = requests.post('http://localhost:9200/tvseries/serie/' + str(show_id), data = show.text)
    print(s + " indexed: " + response.text)

breaking-bad indexed: {"_index":"tvseries","_type":"serie","_id":"169","_version":2,"created":false}
blindspot indexed: {"_index":"tvseries","_type":"serie","_id":"1855","_version":2,"created":false}
the-knick indexed: {"_index":"tvseries","_type":"serie","_id":"11498","_version":2,"created":false}


In [100]:
r = requests.get('http://localhost:9200/tvseries/_search?q=status:ended&pretty')
print r.text

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 11,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "tvseries",
      "_type" : "serie",
      "_id" : "AVKpN8mabPRGM5XqNXQE",
      "_score" : 1.0,
      "_source":{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg"},"summary":"<p><em><strong>\"

## Data types


   * Exact values 
     * String: string
     * Whole number: byte, short, integer, long
     * Floating-point: float, double
     * Boolean: boolean
     * Date: date  / format 


   * Full Text
      * Index
      * Analyze

   * Complex types
      * Null values
      * Arrays
      * Objects 
      * Nested
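
As an illustration of how these types appear in a mapping, the sketch below declares a few fields of the TVmaze documents explicitly, using the Elasticsearch 1.x syntax seen in the outputs of this notebook. The choice of fields, the `not_analyzed` setting for `status`, and the date format are assumptions made for the example.

```python
import json

serie_mapping = {
    "serie": {
        "properties": {
            # Full text: analyzed into tokens at index time
            "summary": {"type": "string"},
            # Exact value: indexed as a single, unmodified term
            "status": {"type": "string", "index": "not_analyzed"},
            # Whole number
            "runtime": {"type": "integer"},
            # Date with an explicit format
            "premiered": {"type": "date", "format": "yyyy-MM-dd"},
            # Object with a floating-point sub-field
            "rating": {"properties": {"average": {"type": "float"}}},
        }
    }
}

mapping = json.dumps(serie_mapping, indent=4)
print(mapping)
```

A body like this would have to be sent before any document is indexed, because the type of a field that already exists in a mapping cannot be changed afterwards.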

## Analyzers

In [None]:
### Whitespace Tokenizer 

[TODO] test a tokenizer
[TODO] Define a tokenizer 


In [17]:
summary = '''Breaking Bad follows protagonist Walter White, a chemistry teacher who lives in New Mexico with his wife 
          and teenage son who has cerebral palsy. White is diagnosed with Stage III cancer and given a prognosis of 
          two years left to live. With a new sense of fearlessness based on his medical prognosis, and a desire to secure
          his family's financial security, White chooses to enter a dangerous world of drugs and crime and ascends to power 
          in this world. The series explores how a fatal diagnosis such as White's releases a typical man from the daily 
          concerns and constraints of normal society and follows his transformation from mild family man to a kingpin 
          of the drug trade.'''

r = requests.post('http://localhost:9200/tvseries/_analyze?field=summary&pretty' , data = summary)
print r.text

{
  "tokens" : [ {
    "token" : "breaking",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "follows",
    "start_offset" : 13,
    "end_offset" : 20,
    "type" : "<ALPHANUM>",
    "position" : 3
  }, {
    "token" : "protagonist",
    "start_offset" : 21,
    "end_offset" : 32,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "walter",
    "start_offset" : 33,
    "end_offset" : 39,
    "type" : "<ALPHANUM>",
    "position" : 5
  }, {
    "token" : "white",
    "start_offset" : 40,
    "end_offset" : 45,
    "type" : "<ALPHANUM>",
    "position" : 6
  }, {
    "token" : "a",
    "start_offset" : 47,
    "end_offset" : 48,
    "type" : "<ALPHANUM>",
    "position" : 7
  }, {
    "token" : "chemistry",
    "start_offset" : 49,
    "end_offset" : 58,
    "type" : "<ALPHANUM>",
    "pos

In [23]:
name = '''Breaking Bad'''

r = requests.get('http://localhost:9200/tvseries/_analyze?field=name&pretty' , data = name)
print r.text

{
  "tokens" : [ {
    "token" : "breaking",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "<ALPHANUM>",
    "position" : 2
  } ]
}



In [51]:
r = requests.get('http://localhost:9200/tvseries/_mappings?pretty')
print r.text

{
  "tvseries" : {
    "mappings" : {
      "tweet" : {
        "properties" : {
          "message" : {
            "type" : "string",
            "store" : true
          }
        }
      },
      "serie" : {
        "properties" : {
          "_links" : {
            "properties" : {
              "nextepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "previousepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "self" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              }
            }
          },
          "externals" : {
            "properties" : {
              "imdb" : {
                "type" : "string"
             

In [73]:
r = requests.delete('http://localhost:9200/tvseries/_mappings/tweet?pretty')
print r.text

{
  "acknowledged" : true
}



In [74]:
mapping = '''{
    "tweet" : {
        "properties" : {
            "message" : {"type" : "string", "store" : true, "analyzer": "keyword" }
        }
    }
}'''

In [75]:
r = requests.put('http://localhost:9200/tvseries/_mappings/tweet?pretty' , data = mapping)

print r.text

{
  "acknowledged" : true
}



When we try to update a mapping that already exists, the change is ignored

In [55]:
mapping = '''{
    "serie" : {
        "properties" : {
            "message" : {"type" : "string", "store" : true }
        }
    }
}'''

r = requests.put('http://localhost:9200/tvseries/_mappings/serie?pretty' , data = mapping)

print r.text

{
  "acknowledged" : true
}



In [79]:
r = requests.get('http://localhost:9200/tvseries/_mappings?pretty')
print r.text

{
  "tvseries" : {
    "mappings" : {
      "tweet" : {
        "properties" : {
          "message" : {
            "type" : "string",
            "store" : true,
            "analyzer" : "keyword"
          }
        }
      },
      "serie" : {
        "properties" : {
          "_links" : {
            "properties" : {
              "nextepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "previousepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "self" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              }
            }
          },
          "externals" : {
            "properties" : {
              "imdb" : {
           

In [83]:
name = '''Breaking Bad'''

r = requests.get('http://localhost:9200/tvseries/_analyze?field=message&pretty' , data = name)
print r.text

{
  "tokens" : [ {
    "token" : "breaking",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "<ALPHANUM>",
    "position" : 2
  } ]
}



In [87]:

        
r = requests.get('http://localhost:9200/tvseries/_analyze?analyzer=keyword&text=Breaking Bad&pretty' , data = name)
print r.text



{
  "tokens" : [ {
    "token" : "Breaking Bad",
    "start_offset" : 0,
    "end_offset" : 12,
    "type" : "word",
    "position" : 1
  } ]
}



In [88]:
r = requests.get('http://localhost:9200/tvseries/_analyze?analyzer=standard&text=Breaking Bad&pretty' , data = name)
print r.text


{
  "tokens" : [ {
    "token" : "breaking",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "<ALPHANUM>",
    "position" : 2
  } ]
}



In [93]:
r = requests.get('http://localhost:9200/tvseries/_analyze?analyzer=stop&text=Breaking Bad&pretty' , data = name)
print r.text


{
  "tokens" : [ {
    "token" : "breaking",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "word",
    "position" : 2
  } ]
}



In [95]:
r = requests.get('http://localhost:9200/tvseries/_analyze?analyzer=english&text=Breaking Bad&pretty' , data = name)
print r.text


{
  "tokens" : [ {
    "token" : "break",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "bad",
    "start_offset" : 9,
    "end_offset" : 12,
    "type" : "<ALPHANUM>",
    "position" : 2
  } ]
}



## Structure of an analyzer

  - Character filters - zero or more
    - Remove punctuation marks
    - etc.
  - Tokenizer - exactly one
  - Token filters - zero or more
    - Lowercasing
    - Removing accents and diacritics
    - Stopwords
    - Stemming
    - Synonyms
    - Mapping
    - etc.
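
The three stages can be sketched as a small pure-Python pipeline. This is a toy model, not Elasticsearch code: every function name is hypothetical, the stopword list is made up, and `naive_stem` is a crude suffix stripper rather than a real stemming algorithm. Following Elasticsearch's built-in components, lowercasing is modeled here as a token filter, and HTML stripping is used as the example character filter.

```python
import re

STOPWORDS = {"a", "an", "and", "the", "of", "to", "in"}  # assumed list

def strip_html(text):
    """Character filter: replace HTML tags with spaces before tokenizing."""
    return re.sub(r"<[^>]+>", " ", text)

def tokenize(text):
    """Tokenizer: split the text into runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]

def remove_stopwords(tokens):
    """Token filter: drop very common words."""
    return [t for t in tokens if t not in STOPWORDS]

def naive_stem(tokens):
    """Token filter: strip a trailing 'ing' or 's' (toy stemmer)."""
    stemmed = []
    for t in tokens:
        if t.endswith("ing") and len(t) > 5:
            t = t[:-3]
        elif t.endswith("s") and len(t) > 3:
            t = t[:-1]
        stemmed.append(t)
    return stemmed

def analyze(text, char_filters, tokenizer, token_filters):
    """Run zero or more character filters, one tokenizer, zero or more token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens

tokens = analyze(
    "<p><em>Breaking Bad</em> follows a chemistry teacher</p>",
    [strip_html], tokenize, [lowercase, remove_stopwords, naive_stem],
)
print(tokens)
```

The order of the token filters matters: the stopword filter runs after lowercasing so that a capitalized stopword is still removed.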

## Types of analyzers

 * Standard analyzer
 * Simple analyzer
 * Whitespace analyzer
 * Language analyzer 
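
To get a feel for how these analyzers differ without a running cluster, the sketch below approximates three of them with regular expressions. These are rough stand-ins written for illustration (the function names are hypothetical): the real standard analyzer follows Unicode text segmentation rules that a regular expression only approximates, and language analyzers additionally apply stemming, as the `english` output above shows.

```python
import re

def whitespace_analyzer(text):
    # Splits on whitespace only; case and punctuation are preserved
    return text.split()

def simple_analyzer(text):
    # Splits on anything that is not a letter, then lowercases
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def standard_like_analyzer(text):
    # Rough stand-in for the standard analyzer: runs of letters or digits, lowercased
    return [t.lower() for t in re.findall(r"[^\W_]+", text)]

text = "Breaking Bad: Season-5"
for analyzer in (whitespace_analyzer, simple_analyzer, standard_like_analyzer):
    print(analyzer.__name__, analyzer(text))
```

The whitespace version keeps `Bad:` and `Season-5` intact, the simple version drops the digit, and the standard-like version keeps `5` as its own token.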


## Mappings

In [7]:
r = requests.get('http://localhost:9200/tvseries/_mappings?pretty')
print r.text

{
  "tvseries" : {
    "mappings" : {
      "serie" : {
        "properties" : {
          "_links" : {
            "properties" : {
              "nextepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "previousepisode" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              },
              "self" : {
                "properties" : {
                  "href" : {
                    "type" : "string"
                  }
                }
              }
            }
          },
          "externals" : {
            "properties" : {
              "imdb" : {
                "type" : "string"
              },
              "thetvdb" : {
                "type" : "long"
              },
              "tvrage" : {
                "type" : "long"
              }


## Default search field

## Multiple mappings for one field

## Analyzed vs. non-analyzed fields

## Search operators - QueryDSL

### QueryDSL - terms

In [None]:
payload = """
{
    "query" : {
        "match" : {
            "last_name" : "Smith"
        }
    }
}
"""

r = requests.get('http://localhost:9200/megacorp/employee/_search?pretty', data = payload)
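
Writing the query body as a string literal becomes error-prone once the search text contains quotes. A minimal sketch, assuming a hypothetical helper `match_query`, builds the same body from a dict and lets `json.dumps` take care of the escaping.

```python
import json

def match_query(field, text):
    """Build the JSON body of a match query (hypothetical helper)."""
    return json.dumps({"query": {"match": {field: text}}})

payload = match_query("last_name", "Smith")
print(payload)

# Quotes inside the search text are escaped automatically
print(match_query("about", 'likes "rock climbing"'))
```

The returned string can be passed as `data = payload` to the search request in the same way as the hand-written payload.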

### Query DSL - full-text search

### Query DSL - fuzzy search

### Query DSL - boolean queries

### Query DSL - phrase search

### Query DSL - partial matching

### Query DSL - Boosting

### Multi-field search

### Multi-index and multi-field search

### Relevance in Elasticsearch

### Relevance: *Practical Scoring Function*
  - retrieves documents using a Boolean model
  - assigns relevance using a formula based on ideas from
    - TF-IDF
    - the vector space model


### Default relevance


$$ rel(q,d) = qNorm_q \cdot coord_{q,d} \cdot \sum_{t \in q}{tf_{t,d} \cdot idf_t^2 \cdot boost_t \cdot norm_{t,d}}$$

 - $qNorm_q$: query normalization factor; it is the same for every document in a query, so it can be ignored when ranking
 - $coord_{q,d}$: *coordination factor*, which rewards documents that contain more of the query terms
 - $tf_{t,d}$: frequency of term $t$ in document $d$
 - $idf_t$: inverse document frequency of term $t$, which is higher for rarer terms
 - $boost_t$: *query boost*, which raises the importance of a given term
 - $norm_{t,d}$: index-time normalization factor, which takes the length of the document into account and, optionally, the *index boost*
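
The formula can be tried out on a toy corpus. The sketch below is an illustration under stated assumptions, not Elasticsearch's implementation: it uses the classic Lucene TF-IDF definitions ($tf = \sqrt{freq}$, $idf = 1 + \ln\frac{numDocs}{docFreq + 1}$, $norm = 1/\sqrt{numTerms}$), a whitespace tokenizer, and a three-document corpus made up for the example. Real `_score` values also depend on per-shard statistics and on how norms are stored, so the numbers will not match the outputs in this notebook.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def idf(term, corpus):
    """Inverse document frequency: 1 + ln(numDocs / (docFreq + 1))."""
    doc_freq = sum(1 for doc in corpus if term in tokenize(doc))
    return 1.0 + math.log(len(corpus) / (doc_freq + 1.0))

def practical_score(query, doc, corpus, boosts=None):
    """rel(q,d) = qNorm * coord * sum over matching terms of tf * idf^2 * boost * norm."""
    boosts = boosts or {}
    q_terms = tokenize(query)
    d_terms = tokenize(doc)
    if not q_terms or not d_terms:
        return 0.0
    freqs = Counter(d_terms)
    # qNorm: 1 / sqrt(sum of squared query-term weights)
    q_norm = 1.0 / math.sqrt(
        sum((idf(t, corpus) * boosts.get(t, 1.0)) ** 2 for t in q_terms))
    matched = [t for t in q_terms if freqs[t] > 0]
    coord = len(matched) / float(len(q_terms))   # share of query terms found
    norm = 1.0 / math.sqrt(len(d_terms))         # shorter documents weigh more
    total = sum(
        math.sqrt(freqs[t]) * idf(t, corpus) ** 2 * boosts.get(t, 1.0) * norm
        for t in matched)
    return q_norm * coord * total

# Made-up corpus for illustration
corpus = [
    "a chemistry teacher in new mexico",
    "a surgeon in new york",
    "an fbi agent in times square",
]
scores = [practical_score("new mexico", doc, corpus) for doc in corpus]
for doc, score in zip(corpus, scores):
    print(round(score, 4), doc)
```

The first document contains both query terms, so it benefits from a full coordination factor and from the rarer term `mexico`; the second matches only `new`; the third matches nothing and scores zero.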

## Relevance - Default score
 
[TODO] Query over the series using the summary field



In [5]:
r = requests.get('http://localhost:9200/tvseries/_search?q=New Mexico&pretty')
print r.text

{
  "took" : 50,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 11,
    "max_score" : 0.13050228,
    "hits" : [ {
      "_index" : "tvseries",
      "_type" : "serie",
      "_id" : "AVKpN8mabPRGM5XqNXQE",
      "_score" : 0.13050228,
      "_source":{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com/uploads/images/original_untouched/0/2400.jpg"},"summary":"<p>

### Explaining relevance

In [None]:
r = requests.get('http://localhost:9200/tvseries/_search?q=New Mexico&explain&pretty')

In [6]:
print r.text

{
  "took" : 65,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 11,
    "max_score" : 0.13050228,
    "hits" : [ {
      "_shard" : 0,
      "_node" : "ooHa5uc_QEOIos6c8iHMaw",
      "_index" : "tvseries",
      "_type" : "serie",
      "_id" : "AVKpN8mabPRGM5XqNXQE",
      "_score" : 0.13050228,
      "_source":{"id":169,"url":"http://www.tvmaze.com/shows/169/breaking-bad","name":"Breaking Bad","type":"Scripted","language":"English","genres":["Drama","Crime","Thriller"],"status":"Ended","runtime":60,"premiered":"2008-01-20","schedule":{"time":"22:00","days":["Sunday"]},"rating":{"average":9.3},"weight":2,"network":{"id":20,"name":"AMC","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":18164,"thetvdb":81189,"imdb":"tt0903747"},"image":{"medium":"http://tvmazecdn.com/uploads/images/medium_portrait/0/2400.jpg","original":"http://tvmazecdn.com

[TODO] Explain each of the parameters properly

## Alternative relevance models (text)

Other relevance measures for documents:
  - Okapi BM25
  - A similarity function can be chosen per field; however, changing it requires reindexing

Measures of similarity between strings:
  - Fuzzy similarity

[TODO] How do we change the relevance measure?
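
For reference, the contribution of a single term under Okapi BM25 can be sketched in a few lines. This is one common formulation, written as an illustration with a hypothetical function name; `k1 = 1.2` and `b = 0.75` are the usual default parameters, and all input numbers below are made up.

```python
import math

def bm25_term_score(freq, doc_len, avg_doc_len, num_docs, doc_freq, k1=1.2, b=0.75):
    """Score contribution of one term in one document under BM25."""
    idf = math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    length_ratio = float(doc_len) / avg_doc_len
    # Term frequency saturates: each extra occurrence adds less than the previous one
    tf = freq * (k1 + 1.0) / (freq + k1 * (1.0 - b + b * length_ratio))
    return idf * tf

for freq in (1, 2, 4, 8, 16):
    print(freq, round(bm25_term_score(freq, 100, 100, 1000, 10), 4))
```

Unlike the square-root term frequency of the default formula, the BM25 term frequency component is bounded by `k1 + 1`, so repeating a term many times cannot inflate the score indefinitely. The parameter `b` controls how strongly long documents are penalized.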

## Other relevance measures (structure)

  - Other relevance signals can be taken into account:
     - Time - recency
     - Location - proximity
     - Other numerical fields
  - Difference from databases: the algorithms are designed to sort and return the top-k documents.

## Defining custom relevance

 - function score
 - script score


[TODO] Example that takes a date or an average rating into account
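
Until that example is written against a live index, the idea can be sketched offline. The code below is a pure-Python illustration, not a function score query: it combines a text score with a Gaussian recency decay on the premiere date and a logarithmic factor on the average rating. The function names, the five-year scale, the fixed reference date, and the way the factors are multiplied are all assumptions made for the example.

```python
import math
from datetime import date

def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: 1.0 at the origin, falling to `decay` at `scale` beyond the offset."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))

def custom_score(text_score, premiered, rating, today, scale_days=5 * 365):
    """Multiply the text score by a recency factor and a rating factor."""
    age_days = (today - premiered).days
    recency = gauss_decay(age_days, scale_days)
    rating_factor = math.log1p(rating)   # dampens the effect of large ratings
    return text_score * recency * rating_factor

today = date(2016, 2, 3)  # fixed reference date so the example is reproducible
older = custom_score(1.0, date(2008, 1, 20), 9.3, today)
newer = custom_score(1.0, date(2015, 1, 1), 9.3, today)  # hypothetical newer show
print(older, newer)
```

With the same text score and rating, the more recent show ends up with the higher score. The `scale_days` parameter sets how quickly old documents lose weight.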

## Multi-field relevance

## Multi-field search ??

Motivation:

  * Different uses:
    * Match different full-text queries against different fields, e.g. title and author
    * The order and nesting of bool queries affect the score; boosting may also be used

  * Tuning:
    * dis_max - takes the score of the best-matching field
    * tie_breaker
    * multi_match - helper to direct the same query to several fields
    * fields can be selected using wildcards
    * cross-field entity search

  * best fields
  * most fields
  * cross fields
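
How `dis_max` and `tie_breaker` combine per-field scores can be shown with a few lines of arithmetic. The sketch below uses a hypothetical function name and made-up scores: the best field's score is kept in full, and every other matching field contributes its score multiplied by `tie_breaker`.

```python
def dis_max_score(field_scores, tie_breaker=0.0):
    """Best field score plus tie_breaker times the scores of the other fields."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others

title_and_summary = [0.5, 0.2]   # made-up scores for two fields
for tie_breaker in (0.0, 0.3, 1.0):
    print(tie_breaker, dis_max_score(title_and_summary, tie_breaker))
```

A `tie_breaker` of 0 keeps only the best field, while a value of 1 simply adds up all field scores; values in between let secondary fields break ties without dominating the result.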


## Integration with the search interface