Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rm autogenerated *.cpp files that shouldn't be in source control #2739

Merged
merged 2 commits into from
Feb 24, 2020

Conversation

gojomo
Copy link
Collaborator

@gojomo gojomo commented Jan 29, 2020

Per my comment on #2700, that PR was merged with autogenerated *.cpp files that shouldn't be in source-control. (Long ago, it seems we were committing those to allow install-compilations by people without cython, but more recently I believe all builds/deploys were working fine generating these on demand – and if we were consciously changing back to the old policy, the *.c files would need to also be added.)

@piskvorky @mpenkov - if that matches your understanding, this PR just removes those 3 erroneously-added files, hit the 'merge' button at will.

@piskvorky
Copy link
Owner

piskvorky commented Jan 29, 2020

Thanks. We definitely don't want to keep auto-generated files under VCS.

C++ is always a pain (not sure why this isn't plain C like all the rest) – @persiyanov @menshikh-iv what are the steps to generate those files? Is the process deterministic? Why were the files included?

I suspect there was a reason.

@menshikh-iv
Copy link
Contributor

what are the steps to generate those files?

Same as always with bdist_ext

Is the process deterministic?

Yes for same envs

Why were the files included?

See #2739 (comment), Gordon explained an reason

@piskvorky
Copy link
Owner

piskvorky commented Jan 29, 2020

Why were the files included?

See #2739 (comment), Gordon explained an reason

I don't think he did – he expressed a hypothesis.

Either way, these files were committed in error, we want the auto-generated files out.

@menshikh-iv
Copy link
Contributor

I don't think he did – he expressed a hypothesis.

He is correct.
BTW as I understand, Gensim has auto-generated files until 3.8.1 from w2v cythom implementation (for this reason, we commited cpp in corpus-file PR, just mimic to existing situation, no mistake here).

If you remove all of them for now, don't forget to update https://github.com/RaRe-Technologies/gensim/wiki/Developer-page#making-a-new-release process (CC @mpenkov), no more needed to "re-generate" files stage.

@gojomo
Copy link
Collaborator Author

gojomo commented Jan 30, 2020

I think the recent re-inclusion of *.cpp files was just an oversight, given both our automated builds & provision of wheels via official releases/installs.

AFAICT:

When I 1st started contributing, the *.c files were in source-control, and it was a little helpful when doing development on plain-Python code from a source-code checkout – there was no need for installing or re-running cython.

But it probably wasn't great project hygiene to have tens-of-thousands of lines of essentially-unreviewed autogenerated code in the source tree, when the authoritative/reviewed version of the code was *.pyx files.

From my reading of the log, it seems the c/c++ generated sources were only removed 2019-10-25 (e859c11), after any official releases. But unless there's a pressing explicit reason to include them – and I don't know any – I believe they should stay out. Asking devs working from a source checkout to install cython, & run it along with other compilers if installing from a source dir, isn't too onerous and does keep true "source" files in source control, and ephemeral interim forms that are regenerable on demand out of source control.

@mpenkov
Copy link
Collaborator

mpenkov commented Jan 30, 2020

@gojomo I think we also need to do something about https://github.com/RaRe-Technologies/gensim/blob/develop/release/cython.sh - we can now "just" remove it, right?

@gojomo
Copy link
Collaborator Author

gojomo commented Jan 30, 2020

Hmm; I've never explicitly run that cython.sh, and it looks obsolete to me, as long as a build/official-release can be completed without that step.

(I've not packaged such a release, and it looks like the choice to take .c/.cpp files out of source happened after the last official release... so it'd just be a matter of ensuring it's not necessary for the official/wheels builds destined for PyPI.)

@gojomo
Copy link
Collaborator Author

gojomo commented Jan 31, 2020

I've expanded this to also delete cython.sh, which, as it specifically adds c/cpp files to source-control, seems against our recent discussion. But I'd still like confirmation that the actual release-to-PyPI process, as last done for 3.8.1, and as envisioned for 4.0.0, doesn't rely on this script. With either – (1) confirmation that it doesn't; or (2) a strong statement that it shouldn't (so release-manager will be sure to work-around its absence) – then I'd say someone should push the 'merge' button on this PR.

@piskvorky
Copy link
Owner

@mpenkov can you please confirm and if OK, merge? Cheers.

@gojomo
Copy link
Collaborator Author

gojomo commented Feb 11, 2020

If there's doubts whether cython.sh can go, I can back that out, and merge this in the form approved by @piskvorky & @menshikh-iv.

If whoever packages up releases confirms that cython.sh isn't necessary/wise for future releases (because it returns to committing autogenerated *.c/*.cpp files to git), it can be merged as-is.

But it'd be nice to merge this in one form or the other, to get those *.cpp files out, which also simplifies related development (because git then isn't constantly fussy about unstaged/uncommitted changes to those 'tracked' files).

Copy link
Collaborator

@mpenkov mpenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in looking at this.

I don't think cython.sh is necessary. Let's leave it out. If we happen to need it when we make a new release, then it will be trivial to resurrect it.

@piskvorky
Copy link
Owner

@mpenkov if you're OK with this PR, feel free to merge.

@gojomo
Copy link
Collaborator Author

gojomo commented Feb 24, 2020

With 3 approvals, I'm pressing the 'merge' button.

@gojomo gojomo merged commit 68ec5b8 into piskvorky:develop Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants