processing pipeline example #381

Closed
eklem opened this issue Jun 25, 2017 · 10 comments
Labels
DOCUMENTATION Something should be explained better QUESTION Its always ok to ask a question, even if it is only loosely related to search-index

Comments

@eklem
Collaborator

eklem commented Jun 25, 2017

See if I can get the Chinese tokenizer working in the add pipeline.

@eklem eklem self-assigned this Jun 25, 2017
@eklem eklem added the DOCUMENTATION Something should be explained better label Jun 25, 2017
@eklem
Collaborator Author

eklem commented Jul 25, 2017

@fergiemcdowall, could you explain a little how a module would fit into the processing pipeline? For example, it could take a URL key/value pair and duplicate it as an ID key with the same value.

@eklem
Collaborator Author

eklem commented Jul 25, 2017

Then I'll make some examples and create some documentation.

@fergiemcdowall
Owner

Yes, you could do something like:

fs.createReadStream('myData')
  .pipe(myAmazingProcessingStage)
  .pipe(index.feed())

@eklem
Collaborator Author

eklem commented Jul 26, 2017

I'll test. myAmazingProcessingStage doesn't need to be written streamy, or does it? I guess I have no idea what I'm heading into =)

Espen

@fergiemcdowall
Owner

It's not too ridiculous: they are just transform streams. Here are some examples: https://github.com/fergiemcdowall/docproc/tree/master/pipeline
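
To make the "just transform streams" point concrete, here is a minimal sketch of a stage that copies a document's url field into id, as asked about earlier in the thread. It uses only Node's built-in stream module. The names copyUrlToId and createCopyUrlToIdStage are made up for illustration, and the Writable sink is only a stand-in for the indexing end of the pipeline (index.feed() in the snippet above), so this is not the search-index or docproc code itself.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream')

// Hypothetical helper: return a copy of the document with `id` set to `url`.
function copyUrlToId (doc) {
  return Object.assign({}, doc, { id: doc.url })
}

// A processing stage is a Transform stream in object mode:
// one document comes in, one (modified) document goes out.
function createCopyUrlToIdStage () {
  return new Transform({
    objectMode: true,
    transform (doc, encoding, callback) {
      callback(null, copyUrlToId(doc))
    }
  })
}

// Stand-in for the indexing end of the pipeline (an assumption for this
// sketch): it just collects whatever documents reach it.
const collected = []
const sink = new Writable({
  objectMode: true,
  write (doc, encoding, callback) {
    collected.push(doc)
    callback()
  }
})

pipeline(
  Readable.from([{ url: 'https://example.com/a', title: 'A' }]),
  createCopyUrlToIdStage(),
  sink,
  (err) => {
    if (err) throw err
    console.log(collected)
  }
)
```

Keeping the per-document logic in a plain function means it can be tested without any streams, and the stage itself stays a thin wrapper.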

@eklem
Collaborator Author

eklem commented Jul 26, 2017

Thanks, I'll test based on the IngestDoc.js. It seems doable.

@eklem eklem closed this as completed Jul 26, 2017
@eklem
Collaborator Author

eklem commented Jul 26, 2017

Or maybe Spy.js is the easiest to base it on, @fergiemcdowall?

@eklem eklem reopened this Jul 26, 2017
@eklem eklem closed this as completed Jul 26, 2017
@eklem eklem added the QUESTION Its always ok to ask a question, even if it is only loosely related to search-index label Jul 26, 2017
@fergiemcdowall
Owner

Yes, and you would of course do something with doc before pushing it back into the pipeline.
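
As a rough illustration of "do something with doc before pushing it back", the sketch below shows a class-based stage built on Node's stream.Transform. The trimming step and the names trimStringFields and TrimStage are assumptions chosen for the example and are not taken from Spy.js or any other docproc stage. Any per-document change would slot into the same place.

```javascript
const { Transform } = require('stream')

// Hypothetical "do something with doc" step: trim whitespace from every
// top-level string field. Returns a new object and leaves the input untouched.
function trimStringFields (doc) {
  const out = {}
  for (const [key, value] of Object.entries(doc)) {
    out[key] = typeof value === 'string' ? value.trim() : value
  }
  return out
}

// Object-mode stage: process the incoming doc, then push it back
// into the pipeline for the next stage to pick up.
class TrimStage extends Transform {
  constructor () {
    super({ objectMode: true })
  }

  _transform (doc, encoding, done) {
    this.push(trimStringFields(doc))
    done()
  }
}

// Usage: write one document in and log the processed one as it comes out.
const stage = new TrimStage()
stage.on('data', (doc) => console.log(doc))
stage.write({ url: ' https://example.com/a ', rank: 1 })
stage.end()
```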

@eklem
Collaborator Author

eklem commented Jul 26, 2017

Thanks. I can do this =)
I'll leave it open since it's a documentation task and not just a question. I'll test and then document.

@eklem eklem reopened this Jul 26, 2017
@fergiemcdowall
Owner

I've added this to the list: #458
