-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add github_repository_content table #207
feat: add github_repository_content table #207
Conversation
c0f1fd4
to
96cfadc
Compare
github_public.github_repository_content | ||
where | ||
repository_full_name = 'github/docs' | ||
and repository_content_path = 'docs' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and repository_content_path = 'docs' | |
and repository_content_path = 'docs'; |
github_public.github_repository_content | ||
where | ||
repository_full_name = 'github/docs' | ||
and repository_content_path = '.vscode/settings.json' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and repository_content_path = '.vscode/settings.json' | |
and repository_content_path = '.vscode/settings.json'; |
{Name: "content", Description: "The decoded file content (if the element is a file).", Type: proto.ColumnType_STRING, Transform: transform.FromMethod("GetContent")}, | ||
{Name: "target", Description: "Target is only set if the type is \"symlink\" and the target is not a normal file. If Target is set, Path will be the symlink path.", Type: proto.ColumnType_STRING}, | ||
{Name: "sha", Description: "The sha of the file.", Type: proto.ColumnType_STRING, Transform: transform.FromField("SHA")}, | ||
{Name: "url", Description: "Url of file's metadata", Type: proto.ColumnType_STRING}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{Name: "url", Description: "Url of file's metadata", Type: proto.ColumnType_STRING}, | |
{Name: "url", Description: "URL of file's metadata.", Type: proto.ColumnType_STRING}, |
{Name: "git_url", Description: "Git url (with SHA) of the file", Type: proto.ColumnType_STRING}, | ||
{Name: "html_url", Description: "Raw file url in GitHub", Type: proto.ColumnType_STRING}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{Name: "git_url", Description: "Git url (with SHA) of the file", Type: proto.ColumnType_STRING}, | |
{Name: "html_url", Description: "Raw file url in GitHub", Type: proto.ColumnType_STRING}, | |
{Name: "git_url", Description: "Git URL (with SHA) of the file.", Type: proto.ColumnType_STRING}, | |
{Name: "html_url", Description: "Raw file URL in GitHub.", Type: proto.ColumnType_STRING}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aminvielledebatAtBedrock for the contribution.
I have left a few minor comments on the PR. If you have any questions on the feedback, please reach out 😄 .
96cfadc
to
f8c688a
Compare
So cool @misraved and @cbruno10 ! One last point : actually IMO, it's not the most efficient way 😢 |
I'm excited about this table ... wondering about the name ... is Generally the name of the table should represent what is in each row. Perhaps |
@e-gineer I think It gives all the contents of the target repository. @aminvielledebatAtBedrock I think using a specific Hydrate on the @cbruno10 could you please share your thoughts on this one? |
f8c688a
to
a1f91df
Compare
Hey @aminvielledebatAtBedrock , sorry for the delay in review, I'll have a look tomorrow and provide any additional feedback. |
a1f91df
to
b9244da
Compare
Hi |
Hi @gforien , @aminvielledebatAtBedrock , sorry, we're (still) working through a backlog of issues and other PR reviews (in other plugin and mod repos as well). This is definitely on our radar though and we'll address the review as soon as we can. |
b9244da
to
a201ac3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aminvielledebatAtBedrock Please see review questions and suggestions, thanks!
|
||
//// TABLE DEFINITION | ||
|
||
func tableGitHubRepositoryContent() *plugin.Table { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aminvielledebatAtBedrock I tried to use this table to list contents of a directory that contained a .png
file, and then tried a query to get the actual .png
file:
> select * from github_repository_content where repository_full_name = 'turbot/steampipe-mod-aws-insights' and repository_content_path = 'docs/images/aws-insights-console-graphic.png'
Error: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8 (SQLSTATE HV000)
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
| repository_full_name | type | name | repository_content_path | path | size | content | target | sha | url | git_url | html_url | download_url | _ctx |
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
> select * from github_repository_content where repository_full_name = 'turbot/steampipe-mod-aws-insights' and repository_content_path = 'docs/images'
Error: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8 (SQLSTATE HV000)
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
| repository_full_name | type | name | repository_content_path | path | size | content | target | sha | url | git_url | html_url | download_url | _ctx |
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
+----------------------+------+------+-------------------------+------+------+---------+--------+-----+-----+---------+----------+--------------+------+
I'm not sure what the repository content API returns, but have you tested using this table when getting content for non-text files, or listing directories that contain them? For instance, does this table also work with GIF, JPEG, SVG, Microsoft Office (Word, PPT, Excel), PDF, etc., files? If so, what's in content
for them?
Also, I'm not sure if you're on a different version, but I'm on github.com/google/go-github/v48 v48.0.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried against an .svg
file and it seemed to return the content
OK
{Name: "name", Description: "The file name.", Type: proto.ColumnType_STRING}, | ||
{Name: "repository_content_path", Description: "The requested path in repository search.", Type: proto.ColumnType_STRING, Transform: transform.FromQual("repository_content_path")}, | ||
{Name: "path", Description: "The path of the file.", Type: proto.ColumnType_STRING}, | ||
{Name: "size", Description: "The size of the file (in MB).", Type: proto.ColumnType_INT}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aminvielledebatAtBedrock I saw that the GitHub API lists some caveats and restrictions about file size in https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28#size-limits.
What are the query results/error if the file size is over 100 MB? Also, for files less than 1 MB and for those between 1 - 100 MB, are there any differences in the column values, or do the differences not affect the table?
|
||
func tableGitHubRepositoryContent() *plugin.Table { | ||
return &plugin.Table{ | ||
Name: "github_repository_content", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the table name (originally brought up in #207 (comment)), I think I still like the name github_repository_content
, but am open to github_repository_file
.
For github_repository_content
, it's the name that the GitHub API uses and because each row can return each file's contents, the name still seems to fit.
On the other hand, github_repository_file
would also be intuitive, as each row contains a file (giving meta information and contents).
Between the two, I don't have any strong preferences.
@e-gineer @johnsmyth @aminvielledebatAtBedrock - Curious to hear your thoughts as well, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cbruno10 I also lean toward github_repository_content
if the table also returns submodules and directories in addition to files.
ShouldIgnoreError: isNotFoundError([]string{"404"}), | ||
KeyColumns: []*plugin.KeyColumn{ | ||
{Name: "repository_full_name", Require: plugin.Required}, | ||
{Name: "repository_content_path", Require: plugin.Optional, CacheMatch: "exact"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a separate column for this? Could we use path
instead as the optional key column?
List: &plugin.ListConfig{ | ||
Hydrate: tableGitHubRepositoryContentList, | ||
ShouldIgnoreError: isNotFoundError([]string{"404"}), | ||
KeyColumns: []*plugin.KeyColumn{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For key columns, are we also able to pass in the ref
, e.g., commit, branch, that the API lists?
'This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.' |
Hey @aminvielledebatAtBedrock , do you think you'll have time to look at the review suggestions soon? We're keen to get your awesome table in, so if you don't have time, we can merge your branch into a local feature branch and then go from there. Thanks! |
Hi @cbruno10 ! Sorry about the delay, i'm working on other stuff right now... but we need this table to exploit baseline from https://github.com/Yelp/detect-secrets accross our projects. So, I plan to work on it the next weeks. If it's acceptable for you, you can rely on me. |
'This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.' |
Don't stale this PR. I'm still working on ;) |
@aminvielledebatAtBedrock Blame the bot 😆 Thanks for checking up on the issue! |
@aminvielledebatAtBedrock Hey, just checking in, would you like any help with this PR? We're happy to do some additional testing for the suggestions above if so. Thanks! |
20de5e4
into
turbot:add-github-repo-content-table
Hey @aminvielledebatAtBedrock , thanks again for your work on this table! For now, I've merged it into the If you get time and are interested in working more on the table, feel free to fork the |
Hey @cbruno10 ! Sorry again for the delay. You're totally right by doing this. Thank you so much for your efforts ! |
List a repository
Query
Results
List a directory in a repository
Query
Results
Get a file in a repository
Query
Results
version: 1
links:
- name: Observability docs
description: Docs Engineering observability docs
kind: docs
service: docs-internal
url: https://github.com/github/docs-engineering/blob/main/docs/observability/docs.github.com.md
- name: Datadog dashboard
description: Production Datadog dashboard
kind: dashboard
service: docs-internal
url: https://app.datadoghq.com/dashboard/8uc-x67-tgi/help-docs
- name: Sentry dashboard
description: Production Sentry dashboard
kind: dashboard
service: docs-internal
url: https://sentry.io/organizations/github/issues/?project=5394325
- name: Help Docs playbook
description: How to respond to Help Docs Alerts
kind: playbook
service: docs-internal
url: https://github.com/github/docs-engineering/blob/main/docs/playbooks/docs/index.md
- name: Maintaining the deployment
description: Document describing the production deployment and how to troubleshoot it.
kind: docs
service: docs-internal
url: https://github.com/github/docs-internal#deployments
- name: PagerDuty escalation policy
description: Escalation policy and on-call schedule for incident response
kind: tool
service: docs-internal
url: https://github.pagerduty.com/schedules#PSTWXGP
- name: Feature docs
description: Docs describing a few key features of Help Docs
kind: docs
service: docs-internal
url: https://github.com/github/docs-engineering/tree/main/docs
- name: Technical architecture
description: This document covers the technical architecture of Help Docs
kind: docs
service: docs-internal
url: https://github.com/github/docs-internal#readmes
- name: Spin up locally
description: How to setup Help Docs on your local machine
kind: docs
service: docs-internal
url: https://github.com/github/docs-internal#developing