-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge the Report._files/_chunks
#553
base: main
Are you sure you want to change the base?
Conversation
4974cef
to
f6d9418
Compare
f6d9418
to
054211b
Compare
✅ Sentry found no issues in your recent changes ✅ |
054211b
to
7bac317
Compare
CodSpeed Performance ReportMerging #553 will degrade performances by 42.1%Comparing Summary
Benchmarks breakdown
|
f3d14c7
to
83f9302
Compare
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## main #553 +/- ##
==========================================
- Coverage 88.88% 88.83% -0.05%
==========================================
Files 461 461
Lines 13192 13134 -58
Branches 1519 1505 -14
==========================================
- Hits 11726 11668 -58
- Misses 1142 1145 +3
+ Partials 324 321 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Codecov ReportAttention: Patch coverage is ✅ All tests successful. No failed tests found.
📢 Thoughts on this report? Let us know! 🚀 New features to boost your workflow:
|
Instead of maintaining two separate fields, this merges the previous `_files` which was a dict from filename to `totals`, and the `_chunks` which was either a fully parsed `ReportFile`, or just a shallow `list` of lines. Maintaining only a single structure for the list of files makes a ton of internals of the `Report` simpler. It also solves the problem of `file_totals` getting out of sync with the `ReportFile.totals`.
83f9302
to
10dd46e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good change!
@@ -66,19 +65,16 @@ def generate_carryforward_report( | |||
Returns: | |||
EditableReport: A new report with only the info related to `flags` on it, as described above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[very nit] now it returns Report
, although they might be the same thing?
Docs also mention that it "creates a new report", but the code suggests all changes are done in-place, and the same report is returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if that could pose a problem... say we use commit.parent
report to build a "new" carryforward report for commit
.
Would we save the modified commit.parent
report back with the changes? Probably not, but if it's possible it might be a good idea to explicit that in the documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reminding me about this. Indeed this is a behavior change, and I need to update the docs, and make sure that the calling code is aware of this.
) | ||
if bind: | ||
self._chunks[_file.file_index] = report_file | ||
def get(self, filename, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that silently ignoring the bind
argument (and the _else
to some extent) could lead to unexpected behavior.
I'm sure this way it's easier to integrate with the rest of the code, but maybe it would be better to fail here if people are using those extra options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats a good point. Python is also a bit annoying with *args
vs **kwargs
as well which can cause problems.
regardless of that, I believe the bind
argument was unused before. I was fixing a bug with that recently as it was completely broken otherwise. So I’m pretty sure noone was using that argument.
I think this PR will need quite a bit of refactoring for downstream code as well, which I haven’t started yet, but will do so soonish.
Instead of maintaining two separate fields, this merges the previous
_files
which was a dict from filename tototals
, and the_chunks
which was a list of split chunks, themselves optionally a fully parsedReportFile
.Maintaining only a single structure for the list of files makes a ton of internals of the
Report
simpler.It also solves the problem of
file_totals
getting out of sync with theReportFile.totals
.Instead of maintaining a
FileSummary
dataclass, and a separateReportFile
, this merges these two structures together, and makes sure that aReport
always hasReportFile
s.Though the new
ReportFile
has a bit more logic around making parsing of its actualchunk
contents on-demand only when needed, and avoid having to parse things when not strictly necessary.I think the ~40% perf regression is reasonable, as it is doing more work now for a "shallow" parse. The slowness comes from the
ReportFile
constructor being slow. Part of that is because it has to support various arguments and argument types which are primarily only used within tests. In real usage, we either parseReportFile
s from json/chunks, or start with an empty file that has lines added/replaced into it.