Stop allocating the documents we return in RAM #4383

Open
Tracked by #4467
irevoire opened this issue Feb 1, 2024 · 1 comment · Fixed by #4544
Labels
breaking change · bug · CPU/RAM usage · enhancement · performance · v2

Comments

@irevoire
Member

irevoire commented Feb 1, 2024

Describe the bug
We allocate a lot of RAM when serving search or document requests with a very high limit.
milli seems to handle that well, but in Meilisearch we then enter the function that deserializes and allocates every document in RAM.
For a few thousand documents, that can easily take multiple MiB of RAM, and when people run requests like that under high load, it can get us OOM-killed. I think that's what is happening here (private link): https://github.com/meilisearch/meilisearch-support/issues/100

• For the search:

```rust
let mut documents = Vec::new();
let documents_iter = index.documents(&rtxn, documents_ids)?;
for ((_id, obkv), score) in documents_iter.into_iter().zip(document_scores.into_iter()) {
    // First generate a document with all the displayed fields
    let displayed_document = make_document(&displayed_ids, &fields_ids_map, obkv)?;

    // select the attributes to retrieve
    let attributes_to_retrieve = to_retrieve_ids
        .iter()
        .map(|&fid| fields_ids_map.name(fid).expect("Missing field name"));
    let mut document =
        permissive_json_pointer::select_values(&displayed_document, attributes_to_retrieve);

    let (matches_position, formatted) = format_fields(
        &displayed_document,
        &fields_ids_map,
        &formatter_builder,
        &formatted_options,
        query.show_matches_position,
        &displayed_ids,
    )?;

    if let Some(sort) = query.sort.as_ref() {
        insert_geo_distance(sort, &mut document);
    }

    let mut semantic_score = None;
    for details in &score {
        if let ScoreDetails::Vector(score_details::Vector {
            target_vector: _,
            value_similarity: Some((_matching_vector, similarity)),
        }) = details
        {
            semantic_score = Some(*similarity);
            break;
        }
    }

    let ranking_score =
        query.show_ranking_score.then(|| ScoreDetails::global_score(score.iter()));
    let ranking_score_details =
        query.show_ranking_score_details.then(|| ScoreDetails::to_json_map(score.iter()));

    let hit = SearchHit {
        document,
        formatted,
        matches_position,
        ranking_score_details,
        ranking_score,
        semantic_score,
    };
    documents.push(hit);
}
```
• For the documents route:

```rust
fn retrieve_documents<S: AsRef<str>>(
    index: &Index,
    offset: usize,
    limit: usize,
    filter: Option<Value>,
    attributes_to_retrieve: Option<Vec<S>>,
) -> Result<(u64, Vec<Document>), ResponseError> {
    let rtxn = index.read_txn()?;
    let filter = &filter;
    let filter = if let Some(filter) = filter {
        parse_filter(filter)
            .map_err(|err| ResponseError::from_msg(err.to_string(), Code::InvalidDocumentFilter))?
    } else {
        None
    };

    let candidates = if let Some(filter) = filter {
        filter.evaluate(&rtxn, index).map_err(|err| match err {
            milli::Error::UserError(milli::UserError::InvalidFilter(_)) => {
                ResponseError::from_msg(err.to_string(), Code::InvalidDocumentFilter)
            }
            e => e.into(),
        })?
    } else {
        index.documents_ids(&rtxn)?
    };

    let (it, number_of_documents) = {
        let number_of_documents = candidates.len();
        (
            some_documents(index, &rtxn, candidates.into_iter().skip(offset).take(limit))?,
            number_of_documents,
        )
    };

    let documents: Result<Vec<_>, ResponseError> = it
        .map(|document| {
            Ok(match &attributes_to_retrieve {
                Some(attributes_to_retrieve) => permissive_json_pointer::select_values(
                    &document?,
                    attributes_to_retrieve.iter().map(|s| s.as_ref()),
                ),
                None => document?,
            })
        })
        .collect();

    Ok((number_of_documents, documents?))
}
```

As a bonus, it should make search slightly faster when people retrieve a lot of documents.

I’ve added the spike label because, since we’re not returning the results as JSON Lines, I don’t know how it’ll work out with serde_json.
We may need to do something very custom 😬
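
As a sketch of what "something custom" could look like (an assumption, not a committed design): serde_json exposes a low-level `Serializer` API that can emit a JSON array element by element, so only one serialized document needs to be alive at a time.

```rust
use serde::ser::{SerializeSeq, Serializer};

// Sketch only: the `json!` stub stands in for a real document pulled from the
// index; the eventual implementation may look nothing like this.
fn write_documents<W: std::io::Write>(writer: W) -> serde_json::Result<()> {
    let mut ser = serde_json::Serializer::new(writer);
    // `None`: the number of documents is not known upfront.
    let mut seq = ser.serialize_seq(None)?;
    for id in 0..3u32 {
        // Each document is built, serialized, and dropped before the next one.
        let document = serde_json::json!({ "id": id });
        seq.serialize_element(&document)?;
    }
    seq.end() // writes the closing `]`
}
```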

Expected behavior
We stream the response to the end user without ever allocating more than one document at a time.
See https://docs.rs/actix-web/latest/actix_web/struct.HttpResponseBuilder.html#method.streaming
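
For illustration, here is a minimal sketch of how such a streaming response could be wired with actix-web; the document iterator and the manual `[`/`,`/`]` framing are assumptions, not the final design:

```rust
use actix_web::HttpResponse;
use bytes::Bytes;
use futures_util::stream;
use std::convert::Infallible;

// Sketch only: stream a JSON array as one `Bytes` chunk per document, so the
// response body never holds more than one serialized document at a time.
fn stream_documents() -> HttpResponse {
    // Stand-in for the documents iterator coming out of the index.
    let documents = (0..3u32).map(|id| serde_json::json!({ "id": id }));

    let chunks = documents
        .enumerate()
        .map(|(i, document)| {
            // `[` before the first element, `,` before every other one.
            let mut buf = if i == 0 { b"[".to_vec() } else { b",".to_vec() };
            serde_json::to_writer(&mut buf, &document).expect("serializing a Value cannot fail");
            Ok::<Bytes, Infallible>(Bytes::from(buf))
        })
        .chain(std::iter::once(Ok(Bytes::from_static(b"]"))));

    HttpResponse::Ok()
        .content_type("application/json")
        .streaming(stream::iter(chunks))
}
```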

Meilisearch version:
all


Update after a first investigation

By limiting the maximum number of searches processed at the same time, we significantly reduce RAM consumption.

Fixing this issue would make the engine faster (and able to process more search requests) but would not especially prevent crashes.
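
For context, that mitigation amounts to something like the sketch below; the `Semaphore`-based limiter is illustrative, not Meilisearch's actual code. The semaphore would be created once at startup, e.g. `Arc::new(Semaphore::new(max_concurrent_searches))`, and shared across handlers.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Sketch only: a process-wide semaphore caps how many search requests can
// materialize documents at the same time, which bounds peak RAM usage.
async fn handle_search(search_permits: Arc<Semaphore>) {
    // Waits if the maximum number of concurrent searches is already reached.
    let _permit = search_permits.acquire().await.expect("the semaphore is never closed");
    // run_search().await; // hypothetical: the search runs while the permit is held
} // the permit is dropped here, letting the next queued search proceed
```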

@irevoire added the bug, enhancement, performance and CPU/RAM usage labels Feb 1, 2024
@curquiza changed the title from "Stop allocating the documents we returns in RAM" to "Stop allocating the documents we return in RAM" Feb 6, 2024
@irevoire added the spike label Feb 12, 2024
@irevoire mentioned this issue Mar 7, 2024
@irevoire removed the spike label Mar 28, 2024
meili-bors bot added a commit that referenced this issue May 16, 2024
4544: Stream documents r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes #4383


### Perf
2M hackernews documents:

| branch | Time to retrieve | RAM consumption |
| ------ | ---------------- | --------------- |
| main   | 7s               | 2+ GiB          |
| stream | 4.7s             | Too small       |

Co-authored-by: Tamo <tamo@meilisearch.com>
@meili-bors bot closed this as completed in 59ecf1c May 17, 2024
@curquiza added this to the v1.9.0 milestone May 17, 2024
@irevoire
Member Author

We cannot fix this issue without breaking the get documents and search routes: streaming our answer to the client is a breaking change.

@irevoire reopened this May 20, 2024
@irevoire removed this from the v1.9.0 milestone May 20, 2024
@irevoire added the v2 label May 20, 2024
@curquiza added the breaking change label May 20, 2024