Add tracklet computation when re-computing science data #535

Merged on Jan 12, 2022 (8 commits)

JulienPeloton
Member

IMPORTANT: Please create an issue first before opening a Pull Request.
Linked to issue(s): Closes #529

What changes were proposed in this pull request?

This PR adds tracklet computation to the reprocessing step and speeds up the computation:

  • Call add_tracklet_information during reprocessing
  • Use the DataFrame filter function instead of a Pandas UDF to speed up the filtering
  • Broadcast the tracklet DataFrame in the join to speed it up

The second point revealed an interesting behaviour: it seems that apply_user_defined_filter was using drb and not rb. If this is confirmed (TBD), we should switch to drb and filter everywhere.

How was this patch tested?

Manually, and with CI.

sonarcloud bot commented Jan 12, 2022

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 1 (rating A)

Coverage: no coverage information
Duplication: 0.0%

codecov bot commented Jan 12, 2022

Codecov Report

Merging #535 (d1eaa02) into master (5c69071) will decrease coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #535   +/-   ##
=====================================
- Coverage      95%    95%   -1%     
=====================================
  Files          20     20           
  Lines         989    989           
=====================================
- Hits          943    942    -1     
- Misses         46     47    +1     

Impacted Files | Coverage Δ
bin/merge_ztf_night.py | 100% <100%> (ø)
fink_broker/tracklet_identification.py | 90% <100%> (-2%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5c69071...d1eaa02.

JulienPeloton merged commit 9fb038b into master on Jan 12, 2022
Development

Successfully merging this pull request may close these issues.

[Tracklet] add tracklet compute when recreating science data