Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to FlsaModel #3435

Closed
wants to merge 5 commits into from
Closed

Fixes to FlsaModel #3435

wants to merge 5 commits into from

Conversation

ERijck
Copy link
Contributor

@ERijck ERijck commented Jan 17, 2023

Remove fuzzytm dependency and add ImportError for pyfume. Fixes points raised in #3428.

@piskvorky piskvorky added this to the Next release milestone Jan 17, 2023
@ERijck ERijck mentioned this pull request Jan 17, 2023
@osma
Copy link

osma commented Jan 20, 2023

Would it be worth declaring pyfume as an optional dependency of flsamodel? That would abstract the relationship a bit further. So basically use extras_require in setup.py like this:

    extras_require={
        "flsamodel": ["pyfume==0.2.5"],
    },

Then change the user instructions in the error message (and any related documentation) to this:

pip install gensim[flsamodel]

This would abstract away the specifics - maybe in a future release, the pyfume version requirement will be bumped, or another dependency is needed for flsamodel, but the command to install the needed dependencies remains the same and the details will be managed in setup.py just like required dependencies.

@piskvorky
Copy link
Owner

I like that idea!

@ERijck is really struggling with git though. Would you be able to gather all these changes in a single PR @osma? So we could finally merge & release a bugfix. Thanks!

@osma
Copy link

osma commented Jan 20, 2023

I can help, but I don't want to step on anyone's toes, nor take credit for other people's work. Also, learning opportunities shouldn't be wasted!

How about this: If this PR is otherwise OK from your perspective, then merge it as it is now. I will then do (today) a subsequent PR only for the optional dependency declaration. After you have merged both PRs, you can make a release. That way @ERijck gets all the due credit and my little adjustment comes on top of that.

@piskvorky
Copy link
Owner

Unfortunately it's not OK – there is a conflicting PR at #3436, attempting (but failing) another one-line change.

Learning opportunities are fine but we need to get the bugfix out ASAP. You could include @ERijck 's commits (~by way of credit), that's not a problem.

@piskvorky
Copy link
Owner

Actually, I have a bit of time today so I'll do a minor code clean-up and push a PR myself :)

@ERijck
Copy link
Contributor Author

ERijck commented Jan 20, 2023

That works too! Thank you for your understanding and patience. I will make sure to understand the git workflow better before I make my next PRs.

@osma
Copy link

osma commented Jan 20, 2023

I see, I wasn't aware of the other, conflicting PR. Well, I hope you can get it all sorted out, and if it helps, I can do a PR with the optional dependency part (though it really just adding the few lines I posted in the above comment to setup.py)

@piskvorky piskvorky changed the title Fixes flsamodel Fixes to FlsaModel Jan 20, 2023
@piskvorky
Copy link
Owner

piskvorky commented Jan 20, 2023

@ERijck do you have a reference / link to an academic paper for FlsaModel? Something online where people can read more about what it is, how it works?

The docstring says only Fuzzy topic models with methods similar to LdaModel which is not terribly illuminating.

@piskvorky
Copy link
Owner

piskvorky commented Jan 20, 2023

Similar with the parameters:

            cluster_method: str ['fcm', 'gk', 'fst-pso']
                Fuzzy clustering method.

Is there a description / tutorial somewhere for what this means? Which value to choose when and why?

@piskvorky
Copy link
Owner

piskvorky commented Jan 20, 2023

Continuing further down the rabbit hole: Am I reading this correctly (_convert_bow) that FlsaModel doesn't support streaming input, and works strictly over "list of list of strings"?

That is at odds with Gensim's core mission and interfaces.

@piskvorky
Copy link
Owner

piskvorky commented Jan 20, 2023

The code requires much more work than I anticipated. I opened #3437 which outlines what needs to be done. As it stands now, I'm -1 on keeping FlsaModel in Gensim, this should not have been merged.

@mpenkov as a stop-gap measure, I propose a minimal PR to drop the setup.py requirement + bugfix.
Then if @ERijck manages to finish the fixes in time for the next release, we'll keep Flsa. Otherwise I propose dropping the model. I know, unfortunate, but the current code quality is insufficient and would give headaches to both Gensim maintainers and users.

@piskvorky piskvorky closed this Jan 20, 2023
@ERijck ERijck deleted the fixes-flsamodel branch January 23, 2023 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants