-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add WalkMetas* functions and meta.Name field #1089
Add WalkMetas* functions and meta.Name field #1089
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1089 +/- ##
==========================================
+ Coverage 52.60% 52.64% +0.03%
==========================================
Files 107 107
Lines 9389 9411 +22
==========================================
+ Hits 4939 4954 +15
- Misses 3523 3526 +3
- Partials 927 931 +4
... and 1 file with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
/hold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me.
The catalogd project will likely make use of this function.
I was a little mislead by this. If I'm not wrong, nothing changes in terms of wirings for clients, I.e clients still call LoadReader like they have been. So it's more The catalogd project will likely benefit from this function
. Is that correct?
alpha/declcfg/load.go
Outdated
} | ||
return walkFn(nil, err) | ||
} | ||
doc = []byte(strings.NewReplacer(`\u003c`, "<", `\u003e`, ">", `\u0026`, "&").Replace(string(doc))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have some documentation for what the values for the NewReplacer
method are? I.e what exactly are we achieving with this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See some context here: #1089 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, just have a few nits/comments - I don't think any of them should block this PR from merging though.
alpha/declcfg/load.go
Outdated
} | ||
return walkFn(nil, err) | ||
} | ||
doc = []byte(strings.NewReplacer(`\u003c`, "<", `\u003e`, ">", `\u0026`, "&").Replace(string(doc))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a bytes
replacer that keeps you from having to double-allocate memory here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a bytes.NewReplacer
. There is bytes.Replace(All)
, but as far as I can tell, they only handle one substitution at a time. So I think options are:
- Roll with this and tolerate the double-allocation
- Call
bytes.Replace
in a loop for each replacement. (seems worse from an allocation standpoint since we have 3 replacements) - Roll our own bytes replacer.
- Figure out a way to not require this replacement in the first place. It's been awhile, but iirc, its because an encoder is used somewhere along the line that we don't control and therefore can't set
enc.SetEscapeHTML(false)
- Not bother with the replacement and let the unicode show up in the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mis-remembered this, sorry - I used this library in the past and found it nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One producer of these artifacts is opm render
itself. :) If we output YAML, it escapes html-hostile runes like these, so we have to replace them on ingest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that the component trying to do "user-facing YAML" should choose how to encode whatever it's outputting and call it good. Having to incur this burden in the entire back-end stack is a bit wild.
@anik120 That section of code reads the entire contents of the catalog into memory (twice actually: once into a bytes.Buffer, and a second time into declcfg.DeclarativeConfig), returns it from the function, and then the caller calls It would be way more performant if we did
Later, if/when we treat metas a bit more opaquely, we can just sync every object directly to the underlying storage without having to collect anything up at all. (e.g. something closer to this, which was the original intent behind this PR in the first place. |
These function are like WalkFS except that callers provide a function that handles a Meta object at a time, rather than an entire file's worth of FBC objects. For callers that can operate on a single meta at a time, these functions will provide significant reductions in memory usage. Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
@@ -93,17 +100,68 @@ func (m Meta) MarshalJSON() ([]byte, error) { | |||
} | |||
|
|||
func (m *Meta) UnmarshalJSON(blob []byte) error { | |||
blob = bytereplacer.New(`\u003c`, "<", `\u003e`, ">", `\u0026`, "&").Replace(blob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the bytes replacement here, which meant I could drop that extra unmarshal call into doc json.RawMessage
when we're deciding the reader. Between the allocation improvement of a direct bytes replacer and the removal of the other unmarshal, we shave off ~7% of the runtime of opm render
when I run it against the operatorhub catalog.
Not too shabby!
In a follow-up, I might take a look at an alternate mode of action.Render
(maybe a new function alongside Run()
?), which lets the caller use a walk function. That would save a bunch on memory and might shave off more time as well.
// TODO: return an error that includes the the full JSON message, | ||
// the offset of the error, and the error message. Let callers | ||
// decide how to format it. | ||
return errors.New(resolveUnmarshalErr(blob, err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this here because it was the only place access to the full json message existed after I removed the extra unmarshal to json.RawMessage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: everettraven, joelanford The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel For follow-ups, we have:
|
/LGTM |
e1eebae
into
operator-framework:master
* add WalkMetasFS and WalkMetasReader functions These function are like WalkFS except that callers provide a function that handles a Meta object at a time, rather than an entire file's worth of FBC objects. For callers that can operate on a single meta at a time, these functions will provide significant reductions in memory usage. Signed-off-by: Joe Lanford <joe.lanford@gmail.com> * address PR feedback Signed-off-by: Joe Lanford <joe.lanford@gmail.com> --------- Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Description of the change:
These function are like WalkFS except that callers provide a function that handles a Meta object at a time, rather than an entire file's worth of FBC objects.
This PR also adds
Name string
to the Meta struct, which is necessary information to distinguish meta objects that have the same schema and package (e.g.olm.channel
andolm.bundle
objects)Motivation for the change:
For callers that can operate on a single meta at a time, these functions will provide significant reductions in memory usage. The catalogd project will likely make use of this function.
Closes #1080
Reviewer Checklist
/docs