This issue was moved to a discussion.

Sharing an improvement: a Highly Customizable Text Extractor. #59

Closed
sbihaiko opened this issue Sep 17, 2023 · 0 comments
@sbihaiko

Hey Guys!

The attached file makes it easy to override the text extraction method when customizing a new pipeline. I originally wrote it for my own use, but I think it could be useful to you as well. Here is an example:

var mbuilder = new MemoryClientBuilder();
var memory = mbuilder.Build();
var orchestrator = mbuilder.GetOrchestrator();

// Replace the default MsWordDecoder with a custom decoder for MS Word files
var textExtractor = new TextExtractionHandler("extraction", orchestrator);
textExtractor.AddExtractor(
    (pipeline, file, content, ctoken) =>
    {
        // Default behavior: return new MsWordDecoder().DocToText(content);
        return new MyDecoder().DocToText(content);
    },
    MimeTypes.MsWord
);
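
The attached handler is not reproduced here, so the following is only a minimal sketch of the general idea: keeping one extractor delegate per MIME type and letting a later registration replace an earlier one. The names `Extractor`, `ExtractorRegistry`, and `TryExtract` are hypothetical and are not taken from the attached `TextExtractionHandler` or from the library; the delegate signature is also simplified to raw bytes in, text out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical delegate: receives the raw file bytes and returns the extracted text.
public delegate string Extractor(byte[] content);

// Hypothetical registry, NOT the attached TextExtractionHandler.
public sealed class ExtractorRegistry
{
    // MIME types are compared case-insensitively.
    private readonly Dictionary<string, Extractor> _byMimeType =
        new Dictionary<string, Extractor>(StringComparer.OrdinalIgnoreCase);

    // Registers an extractor for one or more MIME types.
    // A later registration for the same MIME type replaces the earlier one.
    public void AddExtractor(Extractor extractor, params string[] mimeTypes)
    {
        if (extractor == null) throw new ArgumentNullException(nameof(extractor));
        foreach (var mimeType in mimeTypes)
        {
            _byMimeType[mimeType] = extractor;
        }
    }

    // Returns true and the extracted text when an extractor is registered
    // for the given MIME type; otherwise returns false and an empty string.
    public bool TryExtract(string mimeType, byte[] content, out string text)
    {
        if (_byMimeType.TryGetValue(mimeType, out var extractor))
        {
            text = extractor(content);
            return true;
        }
        text = string.Empty;
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        var registry = new ExtractorRegistry();

        // A default extractor, then a custom one that overrides it.
        registry.AddExtractor(content => Encoding.UTF8.GetString(content), "text/plain");
        registry.AddExtractor(content => "[custom] " + Encoding.UTF8.GetString(content), "text/plain");

        if (registry.TryExtract("text/plain", Encoding.UTF8.GetBytes("hello"), out var text))
        {
            Console.WriteLine(text); // prints "[custom] hello"
        }
    }
}
```

Keying the lookup on the MIME type is what makes an override a one-line registration: the caller does not need to subclass anything, only to register a different delegate for the type it wants to handle.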

Best Regards,
Sandro Bihaiko.

TextExtractionHandler.cs.txt

dluc self-assigned this on Sep 25, 2023
dluc added the "enhancement" label on Sep 25, 2023
dluc removed the "enhancement" label on Jun 5, 2024
microsoft locked and limited conversation to collaborators on Jun 5, 2024
dluc converted this issue into discussion #607 on Jun 5, 2024
