__init__() got an unexpected keyword argument 'chunker' while using customized chunker #265
Comments
Can you try with the first level of the dict removed, so just:
pdf_add_config = {
    "chunk_size": 200,
    ...
Linking #251
I think you need to say
My colleague and I found a temporary workaround using the code below:
from typing import Callable, Optional

from embedchain.config.AddConfig import AddConfig
from embedchain.config.BaseConfig import BaseConfig


class ChunkerConfig(BaseConfig):
    """Local stand-in for the chunker settings."""

    def __init__(
        self,
        chunk_size: Optional[int] = 4000,
        chunk_overlap: Optional[int] = 200,
        length_function: Optional[Callable[[str], int]] = len,
    ):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.length_function = length_function


chunker = {
    "chunk_size": 200,
    "chunk_overlap": 20,
    "length_function": len,
}
cc = ChunkerConfig(**chunker)

# AddConfig() rejects the 'chunker' keyword here, so set it as an attribute instead.
ac = AddConfig()
ac.chunker = cc
print(ac.chunker.chunk_size)

pdf_url = 'https://www.rogers.com/cms/pdf/en/Consumer_SUG_V20.pdf'  # online resource
# chat_bot is the embedchain app instance created earlier (not shown here).
chat_bot.add('pdf_file', pdf_url, config=ac)
This seems to work, but it generates the same number of chunks (55 in total for this file) no matter which chunk_size we configure.
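For comparison, here is a minimal sketch of a fixed-window splitter, independent of embedchain, showing that the chunk count should change when chunk_size changes. `split_text` is a hypothetical helper written only for this illustration. It is not how embedchain or its underlying text splitter works, and the sample text is synthetic.

```python
def split_text(text, chunk_size, chunk_overlap):
    """Naive character-window splitter (hypothetical helper, illustration only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "x" * 10_000  # synthetic stand-in for the extracted PDF text
small = split_text(sample, chunk_size=200, chunk_overlap=20)
large = split_text(sample, chunk_size=4000, chunk_overlap=200)
print(len(small), len(large))
```

If the two counts differ here but the real pipeline always reports the same total, that would suggest the configured values are not reaching the splitter, although this sketch cannot confirm where they are lost.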
I also tried the demo in the README, and the same error appears. Could you please help me?
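One way to check whether an installed class accepts a given keyword before calling it is to inspect its constructor signature. The sketch below uses only the standard library. `OldStyleConfig` and `NewStyleConfig` are hypothetical stand-ins, not embedchain classes, and `accepts_keyword` is a helper invented for this example.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if cls.__init__ can be called with `name` as a keyword."""
    params = inspect.signature(cls.__init__).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleConfig:  # hypothetical: constructor without a 'chunker' argument
    def __init__(self):
        pass


class NewStyleConfig:  # hypothetical: constructor that accepts 'chunker'
    def __init__(self, chunker=None, loader=None):
        self.chunker = chunker
        self.loader = loader


print(accepts_keyword(OldStyleConfig, "chunker"))
print(accepts_keyword(NewStyleConfig, "chunker"))

try:
    OldStyleConfig(chunker={"chunk_size": 200})
    raised = False
except TypeError:
    raised = True  # same kind of error as in the issue title
```

Running the same check against the installed `AddConfig` would show whether the README example and the installed package version agree on the constructor arguments.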