AttributeError: 'frozenset' object has no attribute 'add' #24

isiktopcu · 2022-12-28T09:16:21Z

Hi there, I've been using Zeyrek to lemmatize a dataset of 250,000 Turkish tweets. Lemmatization starts normally, but after 10 minutes or so I get this error:

AttributeError Traceback (most recent call last)
in

~\AppData\Roaming\Python\Python39\site-packages\zeyrek\morphology.py in lemmatize(self, text)
137 words = _tokenize_text(text)
138 for word in words:
--> 139 analysis = self._parse(word)
140 if len(analysis) == 0:
141 word_lemmas = [word]

~\AppData\Roaming\Python\Python39\site-packages\zeyrek\morphology.py in _parse(self, word)
94 """ Parses a word and returns SingleAnalysis result. """
95 normalized_word = _normalize(word)
---> 96 return self.analyzer.analyze(normalized_word)
97
98 def _analyze_text(self, text, verbose=False):

~\AppData\Roaming\Python\Python39\site-packages\zeyrek\rulebasedanalyzer.py in analyze(self, word)
29 paths.append(SearchPath.initial(candidate, tail))
30 # search graph.
---> 31 result_paths = self.search(paths)
32
33 # generate results from successful paths.

~\AppData\Roaming\Python\Python39\site-packages\zeyrek\rulebasedanalyzer.py in search(self, current_paths)
59 continue
60 # Creates new paths with outgoing and matching transitions.
---> 61 new_paths = self.advance(path)
62 logging.debug(f"\n--\nNew paths are: ")
63 for p in new_paths:

~\AppData\Roaming\Python\Python39\site-packages\zeyrek\rulebasedanalyzer.py in advance(self, path)
123 last_token = transition.last_template_token
124 if last_token.type_ == 'LAST_VOICED':
--> 125 attributes.add(PhoneticAttribute.ExpectsConsonant)
126 elif last_token.type_ == 'LAST_NOT_VOICED':
127 attributes.add(PhoneticAttribute.ExpectsVowel)

AttributeError: 'frozenset' object has no attribute 'add'
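For anyone reading along, the failure mode in the last frame can be reproduced without Zeyrek: `frozenset` is immutable, so it has no `add` method. The sketch below is only an illustration under assumptions. The helper `add_expected` and the string attribute names are hypothetical stand-ins, and this is not necessarily how the library itself handles the attributes.

```python
def add_expected(attributes, new_attribute):
    """Return a plain mutable set containing `attributes` plus `new_attribute`.

    Hypothetical helper: copying into set() works whether the input is a
    set or a frozenset, so the caller never calls .add() on a frozenset.
    """
    mutable = set(attributes)  # shallow copy into a mutable set
    mutable.add(new_attribute)
    return mutable


# Stand-in values; the real code uses PhoneticAttribute members.
frozen = frozenset({"LastLetterVowel"})

try:
    frozen.add("ExpectsConsonant")  # frozenset has no .add()
except AttributeError as exc:
    message = str(exc)

result = add_expected(frozen, "ExpectsConsonant")
```

Copying leaves the original `frozenset` untouched, which matters if the same object is shared between search paths.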

obulat · 2022-12-29T12:03:11Z

Thank you for reporting the issue, @isiktopcu! The PR I just merged should fix it. Please feel free to re-open if you still encounter it :)

isiktopcu · 2023-02-06T19:35:36Z

FYI: the error is triggered by two words: "ulemalık" and "nakliyatçılık". Whenever the lemmatizer encounters either of them, it raises this error. (I work with Turkish tweets.) If I exclude them from the dataset, everything works fine; I just wanted to let you know. Thank you very much.
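As an alternative to removing the words by hand, a guard around the lemmatizer call can skip inputs that raise and record them for later. This is a minimal sketch, assuming only that the lemmatizer is a callable taking a text (like the `lemmatize(self, text)` method in the traceback). The name `lemmatize_safely` is hypothetical and not part of Zeyrek.

```python
def lemmatize_safely(lemmatize, texts):
    """Apply `lemmatize` to each text, collecting failures instead of stopping.

    `lemmatize` is any callable that takes one string. Returns a pair:
    a list of results (None where the call failed) and a list of
    (index, text, error message) tuples for the failed inputs.
    """
    results = []
    failures = []
    for index, text in enumerate(texts):
        try:
            results.append(lemmatize(text))
        except AttributeError as exc:
            # Keep positions aligned with the input and remember the offender.
            results.append(None)
            failures.append((index, text, str(exc)))
    return results, failures
```

With a Zeyrek analyzer object, one would pass its bound `lemmatize` method as the first argument. The `failures` list then shows exactly which tweets need a closer look.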

obulat · 2023-02-09T16:23:22Z

Thank you for the examples, @isiktopcu! I'll look into it.

obulat mentioned this issue Dec 29, 2022

Fix frozen set issue #25

Merged

obulat closed this as completed in #25 Dec 29, 2022

obulat reopened this Feb 9, 2023
