TypeError: ord() expected a character, but string of length 69 found #59

Closed
kumarsyamala opened this issue Jan 24, 2022 · 1 comment

Comments

@kumarsyamala

kumarsyamala commented Jan 24, 2022

Could you please help me fix this error? I get it while creating the sentence vectors.

from finbert_embedding.embedding import FinbertEmbedding
finbert = FinbertEmbedding()
sentence_embedding = finbert.sentence_vector(df['Clean_text'])  # this is the origin of the error

# Create topic model

topic_model = BERTopic(verbose=True,top_n_words=20)
topics, probs = topic_model.fit_transform(df['Clean_text'], sentence_embedding)

output:

TypeError Traceback (most recent call last)
in ()
2 from finbert_embedding.embedding import FinbertEmbedding
3 finbert = FinbertEmbedding()
----> 4 #sentence_embedding = finbert.sentence_vector(df['Clean_text'].values)
5 # Create topic model

4 frames
/usr/local/lib/python3.7/dist-packages/pytorch_pretrained_bert/tokenization.py in _clean_text(self, text)
306 output = []
307 for char in text:
--> 308 cp = ord(char)
309 if cp == 0 or cp == 0xfffd or _is_control(char):
310 continue

TypeError: ord() expected a character, but string of length 69 found
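
A note on the traceback: the loop `for char in text` followed by `ord(char)` only works when `text` is a single string. If a whole column of sentences is passed instead, each `char` is an entire sentence, which would explain "string of length 69". The sketch below reproduces that failure with plain Python and shows one possible workaround, embedding each document separately. `clean_text` is a simplified imitation of the tokenizer code in the traceback, and `embed_each` and `fake_embed` are hypothetical helpers made up for illustration; none of them belong to finbert_embedding or BERTopic.

```python
from typing import Callable, Iterable, List, Sequence


def clean_text(text: str) -> str:
    """Simplified imitation of the per-character loop shown in the traceback."""
    output = []
    for char in text:
        cp = ord(char)  # fails if `char` is a whole sentence, not one character
        if cp == 0 or cp == 0xFFFD:
            continue
        output.append(char)
    return "".join(output)


def embed_each(
    texts: Iterable[str],
    embed_one: Callable[[str], Sequence[float]],
) -> List[List[float]]:
    """Hypothetical helper: call a single-sentence embedder once per document."""
    return [[float(x) for x in embed_one(str(t))] for t in texts]


def fake_embed(sentence: str) -> List[float]:
    """Stand-in for a real sentence embedder (assumption, for illustration only)."""
    return [float(len(sentence)), float(sentence.count(" "))]


docs = ["stocks rallied today", "bond yields fell"]

# Passing the whole collection reproduces the same kind of TypeError.
error_message = ""
try:
    clean_text(docs)  # type: ignore[arg-type]
except TypeError as exc:
    error_message = str(exc)

# Passing one document at a time works.
embeddings = embed_each(docs, fake_embed)
```

With the real library, `embed_one` would presumably be `finbert.sentence_vector` and `texts` would be `df['Clean_text']`. The resulting list would likely need converting to a 2-D NumPy array before being handed to `topic_model.fit_transform` as the embeddings, with the documents passed as a plain list of strings. This is an untested assumption about how the two libraries fit together.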

@doguaraci
Member

Hi, this is another project called finbert_embedding. We're not affiliated in any way.
