
Extracting chunks of long text as its own record #177

Closed
ch264 opened this issue May 25, 2023 · 2 comments

ch264 commented May 25, 2023

I would like to break up long text pages into individual records.

I can do that in my query like so:

const queries = [
  {
    query: pageQuery,
    transformer: ({ data }) => {
      return data.allMarkdownRemark.edges.map(edge => edge.node).reduce((acc, post) => {
        // Split the raw Markdown body on every "##" heading
        const pChunks = post.rawMarkdownBody.split('##');

        // Build one record per chunk; each record reuses the post's id as its objectID
        const chunks = pChunks.map(chnk => ({
          objectID: post.id,
          headings: post.headings,
          fields: post.fields.slug,
          title: post.frontmatter.title,
          internal: post.internal,
          content: chnk
        }));
        return [...acc, ...chunks];
      }, []);
    },
    indexName: algoliaIndex,
  },
];

In the console I can see that breaking the pages up into individual objects works and follows Algolia's recommendation for splitting up long documents:

[Screenshot: console output showing a page split into multiple chunk objects]

However, when I run 'gatsby build', only the last paragraph of each page makes it into the Algolia index:

[Screenshot: Algolia index showing only the last chunk of each page]

Is there a way to ensure that all of the split-up objects from a page get into the Algolia index? I am unsure how to troubleshoot this. Is breaking up long text documents possible with this plugin?

Thanks so much for your help

Haroenv (Contributor) commented May 26, 2023

Hi, that's because you use the same objectID for every chunk of a post; you need to include the chunk index in it as well. The fixed version would be:

const queries = [
  {
    query: pageQuery,
    transformer: ({ data }) => {
      return data.allMarkdownRemark.edges.map(edge => edge.node).reduce((acc, post) => {
        const pChunks = post.rawMarkdownBody.split('##');

        // Append the chunk index so every record gets a unique objectID,
        // otherwise later chunks overwrite earlier ones in the index
        const chunks = pChunks.map((chnk, index) => ({
          objectID: post.id + '-' + index,
          headings: post.headings,
          fields: post.fields.slug,
          title: post.frontmatter.title,
          internal: post.internal,
          content: chnk
        }));
        return [...acc, ...chunks];
      }, []);
    },
    indexName: algoliaIndex,
  },
];
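
A related note on search quality: once each page is split into several records, Algolia's guidance on long documents also suggests deduplicating at query time so a search returns one hit per page rather than one per chunk. A minimal sketch of how that could look here, assuming the plugin's optional per-query settings object and using a hypothetical transformChunks helper to stand in for the transformer above (the slug stored in fields serves as the grouping attribute):

const queries = [
  {
    query: pageQuery,
    // transformChunks is a hypothetical stand-in for the chunking transformer above
    transformer: ({ data }) => transformChunks(data),
    indexName: algoliaIndex,
    settings: {
      // Group all chunk records that share the same slug stored in `fields`...
      attributeForDistinct: 'fields',
      // ...and return only the single best-matching chunk per page
      distinct: true,
    },
  },
];

With distinct enabled, every chunk's content is still searchable, but the result list collapses to one hit per page.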

ch264 (Author) commented May 26, 2023

Thanks a million for your help @Haroenv. That worked!

ch264 closed this as completed May 26, 2023