Allow easy repo management #113

Open
r0mainK opened this issue Jun 27, 2019 · 4 comments
Labels
enhancement New feature or request triage/needs-product-input This needs input from product

Comments

@r0mainK

r0mainK commented Jun 27, 2019

Most of the ideas can be found in this Slack thread, but basically I would like to be able to manage, and especially exclude, repositories with ease. I know this can also be done by adding filters in Superset, but as a user I want something easier.

Feature proposals

  • Add an --exclude flag on the command line. This would be especially useful when importing repos from organizations, but also when repos are stored locally: one may have centralized all repos in one place but only want to analyze some of them, and may not want, or be allowed, to move them. The flag could take a text file or just repo names.
  • Make this option available in the UI as well. The reasoning is that a user may have already launched sourced-ce, waited a while for the metrics to be computed, and only then noticed that they forgot to remove some repos.
  • Add an --exclude-forks boolean flag that would exclude all forks by default. There is already another issue for this.
  • Add a setting that restricts the analysis of forked repositories to data created after the fork.
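
To make the first and third proposals concrete, here is a minimal sketch of the filtering step in Python. It is not how sourced-ce works: `Repo`, `is_fork`, and `filter_repos` are hypothetical names, and the exclusion list and fork switch are assumed to come from the proposed (not yet existing) `--exclude` and `--exclude-forks` flags.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Repo:
    """Hypothetical repository record; is_fork mirrors the hosting metadata."""
    name: str
    is_fork: bool = False


def filter_repos(repos: Iterable[Repo],
                 exclude: Iterable[str],
                 exclude_forks: bool = False) -> List[Repo]:
    """Return the repos that survive the exclusion list and the fork switch."""
    # Compare names case-insensitively and ignore blank entries.
    excluded = {name.strip().lower() for name in exclude if name.strip()}
    kept = []
    for repo in repos:
        if repo.name.lower() in excluded:
            continue  # explicitly excluded by name
        if exclude_forks and repo.is_fork:
            continue  # excluded because it is marked as a fork
        kept.append(repo)
    return kept


repos = [Repo("repo-a"), Repo("repo-b"), Repo("repo-c", is_fork=True)]
remaining = filter_repos(repos, exclude=["Repo-B"], exclude_forks=True)
print([repo.name for repo in remaining])
```

A fork switch like this only catches repositories that the hosting service marks as forks, so an explicit exclusion list would still be needed for anything else.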
@dpordomingo dpordomingo added the enhancement New feature or request label Jun 27, 2019
@marnovo
Member

marnovo commented Jun 28, 2019

Brainstorming the entry points where the exclusion list could (in theory) be set:

  1. Docker compose
  2. CLI flag:
    1. Repo name(s) as args
    2. File(s) with repo name list as arg(s)
  3. Web UI

Any other?

I assume this would have to take place before/during the init, right? So probably 1 and 2 above are more likely?
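
Options 2.1 and 2.2 could share a single code path. The sketch below, in Python for illustration only, treats each argument as a file of repo names if it points to an existing file, and as a repo name otherwise. `load_exclusions` is a hypothetical helper, and the one-name-per-line format with `#` comments is an assumption, not something decided in this thread.

```python
from pathlib import Path
from typing import Iterable, Set


def load_exclusions(args: Iterable[str]) -> Set[str]:
    """Collect repo names from a mix of plain names and list files."""
    names: Set[str] = set()
    for arg in args:
        arg = arg.strip()
        if not arg:
            continue  # skip empty arguments
        path = Path(arg)
        if path.is_file():
            # Assumed file format: one repo name per line, '#' starts a comment.
            for line in path.read_text(encoding="utf-8").splitlines():
                entry = line.split("#", 1)[0].strip()
                if entry:
                    names.add(entry)
        else:
            # Not an existing file, so treat the argument as a repo name.
            names.add(arg)
    return names
```

One caveat of this design: a repo whose name matches an existing file in the working directory would be read as a list file, so separate flags for names and files might be the safer choice.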

@r0mainK
Author

r0mainK commented Jun 28, 2019

  1. Docker compose: in that case we simply do not mount the repo(s) concerned on the volume.
  2. / 3. Gitbase will have to do the work after being informed, either by dropping the data from its database if it's already launched, or by adding this exclusion functionality if it is not.

I don't really see any other entry points, but I think this should be doable at any point, not only before or during the init, as the functionality could prove useful during data exploration.

@dpordomingo
Contributor

I'd say that my other answer fits here.

@smacker
Contributor

smacker commented Jul 3, 2019

We can start with a flag for the CLI, but in my experience it would be much more useful to filter out repositories from the UI.

I ran srcd-ce without forks on the src-d organization. After it downloaded all the data, I saw some strange data in the charts. I quickly identified the go-vitess repository as the reason. It's not marked as a fork on GitHub, but it is a fork. The point is: a user, just like me, would often identify what should be excluded only AFTER init.

@se7entyse7en se7entyse7en added the triage/needs-product-input This needs input from product label Oct 24, 2019