Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include gslab_make as a Git submodule #38

Closed
Etang21 opened this issue Apr 28, 2021 · 13 comments · Fixed by #39
Closed

Include gslab_make as a Git submodule #38

Etang21 opened this issue Apr 28, 2021 · 13 comments · Fixed by #39
Assignees

Comments

@Etang21
Copy link
Contributor

Etang21 commented Apr 28, 2021

In this issue, we will move to including lib/gslab_make as a Git submodule. We currently do not track which version of gslab_make is used in this template; including gslab_make as a Git submodule will allow us to consistently track the version of gslab_make which is being used, and easily pull gslab_make changes into projects.

This issues is based on this comment on incorporating fixes for gslab_make into other repositories.

@Etang21
Copy link
Contributor Author

Etang21 commented May 5, 2021

@gentzkow I think I still don't have admin access to gentzkow/template. Could I possibly have those keys to the kingdom?

@gentzkow
Copy link
Owner

gentzkow commented May 5, 2021

done!

@Etang21
Copy link
Contributor Author

Etang21 commented May 6, 2021

Thanks a bunch!

Etang21 added a commit that referenced this issue May 11, 2021
@Etang21
Copy link
Contributor Author

Etang21 commented May 11, 2021

In my branch, I've set up gslab_make as its own submodule, but using submodules somewhat alters the directory structure of the template. Concretely, our library folder now carries around the extra files from gslab_make, so it looks like:

--gslab_make
   |--README.md
   |--LICENSE.txt
   |--test
        |--[testing files]
   |--gslab_make
        |--[these are the files we care about]

It is apparently impossible to make a submodule out of only one folder in a repo, as submodules necessarily include all folders. I see a couple options here, some of which are discussed in this post and this post. These are:

(a) In all template paths which use lib/gslab_make scripts, change the path to use lib/gslab_make/gslab_make. This lets us keep extra folders in lib to a minimum.

(b) Make the submodule in a different folder, such as lib/gslab_make_full. Then create a symbolic link from lib/gslab_make to lib/gslab_make_full. This has the benefit of saving us from changing any paths, though it clutters up lib somehwat, and the symbolic link structure may be somewhat counterintuitive.

(c) Maintain our current structure without submodules.

I am still looking into (a) to see where the paths would be set, and will post an update with that information shortly.

@gentzkow
Copy link
Owner

Thanks @Etang21. Sort of annoying. What if we just changed the subdirectory name in the gslab_make repo from gslab_make to something else like source? Then we'd change paths in the template from lib/gslab_make to lib/gslab_make/source.

@Etang21
Copy link
Contributor Author

Etang21 commented May 12, 2021

@gentzkow hm, yes, that does sound better! I think this makes the gslab_make repo structure more readable as a standalone repo, too. Thanks for the great suggestion; I'll take this approach.

@Etang21
Copy link
Contributor Author

Etang21 commented May 12, 2021

Note to self: update README with instructions to make sure that users pull submodule files when cloning this repo for the first time.

@Etang21
Copy link
Contributor Author

Etang21 commented May 13, 2021

@gentzkow I've renamed the gslab_make/gslab_make folder to gslab_make/source in an issue branch and PR of the gslab_make repo, along with associated paths and references in the gslab_make repo. Note that this folder is considered a Python package. However, I'm now having second thoughts and think it might be preferable to maintain the package under its original name of gslab_make.

The reason here is that it feels a bit awkward to me to name a package as source. My thinking:

  • Imports in the gslab_make repo seem clearer with gslab_make as the package name. For example, in the tests in the gslab_make repo, our import statements could be either from gslab_make import clear_dir or from source import clear_dir. The former feels more readable to me.
  • If a user in a different repository wanted to use only the code in the gslab_make package without the surrounding folders (e.g. tests and sphinx), the package's name seems clearer as gslab_make instead of source.
  • Using an informative name for the package seems to be the suggested convention in the Python docs.

I actually think the preferred way could be to leave the gslab_make module name as is, but to change the paths in the template repository. For example, the run_all.py imports would change from

f, path, desc = imp.find_module('gslab_make', [os.path.join(ROOT, 'lib')]) 
gs = imp.load_module('gslab_make', f, path, desc)

to

f, path, desc = imp.find_module('gslab_make', [os.path.join(ROOT, 'lib', 'gslab_make')]) 
gs = imp.load_module('gslab_make', f, path, desc)

What do you think? Apologies here if I'm missing something, or if these ideas are off!

@gentzkow
Copy link
Owner

Thanks @Etang21. That makes sense and your suggestion seems fine to me. This issue is undoubtedly how we ended up with /gslab_make/gslab_make/ in the first place.

Does this mean that most Python packages on Github have /package_name/package_name/ when they are cloned?

@Etang21
Copy link
Contributor Author

Etang21 commented May 13, 2021

@gentzkow great, sounds like a plan!

And yes, that's what it looks like! For example, it seems that NumPy's repository follows this structure, as does the popular requests library.

@Etang21
Copy link
Contributor Author

Etang21 commented May 13, 2021

Interesting -- it looks like what we feared has already happened. The gslab_make code in this template's copy has already diverged from that of the gslab_make repo.

In particular, in this commit for #30, we added the check_conda_status function, which is not present in the gslab_make repository. @gentzkow, it seems to me like I should incorporate any recent changes from this repo's copy of gslab_make into the gslab_make repository. Before I proceed, though, do you foresee any concerns with that?

Apologies to bother you again on this; just trying to proceed with caution around the internal tooling! One possible concern could be that this repo's changes to gslab_make are compatible with gentzkow/template but might not be compatible with gslab/template. I haven't dug enough into the changes to be sure yet, but I will check in on that before I update the gslab_make repository.

@gentzkow
Copy link
Owner

Thanks! Definitely do migrate those changes back to gslab_make if you can. It's not currently used anywhere other than gentzkow/template. The gslab flavor uses a version of the tools from the gslab_python repository.

@Etang21
Copy link
Contributor Author

Etang21 commented May 14, 2021

@gentzkow great, sounds good! Thanks for clarifying which set of tools the gslab template uses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants