Indexing custom fields from a sitemap #558

Open
coprisanu opened this issue Feb 8, 2019 · 3 comments


@coprisanu

Hi,

We have created a sitemap with custom fields. How can we get those custom fields added to the indexed documents?

Thank you

@essiembre
Contributor

I cannot think of an out-of-the-box way to do so with custom fields, but I can think of a workaround if you know your Java.

You could crawl your sitemap as a start URL and write your own ILinkExtractor. The link extractor produces links with some predefined metadata fields that are stored with the document. You could hijack one of those fields to store your own metadata in it (e.g. the "text" attribute of the produced Link objects). Then you would use one of the manipulation options in the Importer module to split your custom metadata values into their own fields.
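
To make the packing idea concrete, here is a minimal sketch in plain JDK Java. It is not the Norconex API: `SitemapFieldPacker`, `PackedLink`, the `|` and `=` delimiters, and the sample field names `category` and `author` are all hypothetical. In the actual workaround, the parsing would live inside a custom ILinkExtractor, the packed string would go into the link's "text" attribute, and the unpacking would be done by an Importer manipulation step rather than by `unpack`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class SitemapFieldPacker {
    // Assumed delimiters; field values containing them would need escaping.
    static final String PAIR_SEP = "|";
    static final String KV_SEP = "=";

    // Hypothetical stand-in for the crawler's link object: a URL plus one text slot.
    static final class PackedLink {
        public final String url;
        public final String text;
        PackedLink(String url, String text) { this.url = url; this.text = text; }
    }

    // Reads each <url> entry and packs every child element other than <loc>
    // into a single "name=value|name=value" string, in document order.
    static List<PackedLink> extract(String sitemapXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(sitemapXml.getBytes(StandardCharsets.UTF_8)));
            List<PackedLink> links = new ArrayList<>();
            NodeList urls = doc.getElementsByTagName("url");
            for (int i = 0; i < urls.getLength(); i++) {
                String loc = null;
                List<String> pairs = new ArrayList<>();
                NodeList children = urls.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (!(child instanceof Element)) continue; // skip whitespace text nodes
                    String name = ((Element) child).getTagName();
                    String value = child.getTextContent().trim();
                    if ("loc".equals(name)) loc = value;
                    else pairs.add(name + KV_SEP + value);
                }
                if (loc != null) links.add(new PackedLink(loc, String.join(PAIR_SEP, pairs)));
            }
            return links;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse sitemap XML", e);
        }
    }

    // Splits a packed string back into individual fields.
    static Map<String, String> unpack(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (text == null || text.isEmpty()) return fields;
        for (String pair : text.split(Pattern.quote(PAIR_SEP))) {
            String[] kv = pair.split(Pattern.quote(KV_SEP), 2);
            if (kv.length == 2) fields.put(kv[0], kv[1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"
                + "<url><loc>https://example.com/a</loc>"
                + "<category>news</category><author>Jane</author></url></urlset>";
        for (PackedLink link : extract(xml)) {
            System.out.println(link.url + " -> " + unpack(link.text));
        }
    }
}
```

Note that this sketch treats every child of `<url>` other than `<loc>` as a field, so standard sitemap elements such as `<lastmod>` would be packed too. Filter them out if they are not wanted.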

Not the most straightforward for sure. We can make this a feature request too.

@coprisanu
Author

coprisanu commented Feb 11, 2019

Thank you, Pascal, for your response.
A new feature request would be a better solution. Could you estimate how long it would take to have the new feature?

@masterbee

+1 to this feature request.
