feat: allow filtering contributions and collaborations [IN-707]#3432
feat: allow filtering contributions and collaborations [IN-707]#3432
Conversation
b57d052 to
4ad3616
Compare
df4c21f to
e9d967f
Compare
|
@gaspergrom can you check @borfast comment from the PR?
Raúl raised this on the backend weekly sync. @epipav if you can also check this one out, I think it will help. This is a change that will affect almost all metrics we display on Insights. |
epipav
left a comment
There was a problem hiding this comment.
Overall looks ok - I added comments to a few places where we changed the parameter name from onlyContributions to includeX style - In these places where we consume the tinybird API, client should also be changed. Have we already made this change somewhere?
Also added a readability nit to the pipe
| DESCRIPTION > | ||
| - `activityTypes_filtered.pipe` allows filtering activityTypes from the respective data source. | ||
| - By default, this only returns code contribution activities (`includeCodeContributions = 1`). | ||
| - To return all activities, set `includeCodeContributions = 1`, `includeCollaborations = 1`, and `includeOtherContributions = 1`. | ||
| - Parameters: | ||
| - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude. | ||
| - `includeCollaborations`: Optional boolean to include or exclude collaboration activities. | ||
| - `includeOtherContributions`: Optional boolean to include other contribution activities (activities that are neither code contributions nor collaborations). | ||
| - Response: `activityType`, `platform`. | ||
| - This pipe is used by other downstream pipes as an auxiliary method of filtering data by activity types. | ||
|
|
||
| NODE activityTypes_selected | ||
| SQL > | ||
| % | ||
| SELECT activityType, platform | ||
| FROM activityTypes | ||
| WHERE | ||
| ( | ||
| -- If no parameters are defined, default to including code contributions. | ||
| {% if not defined(includeCodeContributions) and not defined( | ||
| includeCollaborations | ||
| ) and not defined(includeOtherContributions) %}isCodeContribution = 1 | ||
| {% else %} | ||
| -- Start with a false literal to safely prepend OR clauses in the next statements, | ||
| -- even if the previous guard didn't output anything. | ||
| 0 | ||
| {% if defined(includeCodeContributions) %} | ||
| OR ( | ||
| {{ | ||
| UInt8( | ||
| includeCodeContributions, | ||
| description="Include code contribution activities", | ||
| required=False, | ||
| ) | ||
| }} = 1 AND isCodeContribution = 1 | ||
| ) | ||
| {% end %} | ||
| {% if defined(includeCollaborations) %} | ||
| OR ( | ||
| {{ | ||
| UInt8( | ||
| includeCollaborations, | ||
| description="Include non-code collaboration activities", | ||
| required=False, | ||
| ) | ||
| }} = 1 AND isCollaboration = 1 | ||
| ) | ||
| {% end %} | ||
| {% if defined(includeOtherContributions) %} | ||
| OR ( | ||
| {{ | ||
| UInt8( | ||
| includeOtherContributions, | ||
| description="Include other contribution activities", | ||
| required=False, | ||
| ) | ||
| }} = 1 AND isCodeContribution = 0 AND isCollaboration = 0 | ||
| ) | ||
| {% end %} | ||
| {% end %} | ||
| ) |
There was a problem hiding this comment.
Readability NIT: If we get the parameters to the query using CTEs, it can be shorter and easier to read
%
WITH
{{ UInt8(includeCodeContributions, default=1) }} AS icc,
{{ UInt8(includeCollaborations, default=0) }} AS icol,
{{ UInt8(includeOtherContributions, default=0) }} AS ioc
SELECT activityType, platform
FROM activityTypes
WHERE
(icc = 1 AND isCodeContribution = 1)
OR (icol = 1 AND isCollaboration = 1)
OR (ioc = 1 AND isCodeContribution = 0 AND isCollaboration = 0)
There was a problem hiding this comment.
Good call, I'll change it 👍
| {% if not defined(onlyContributions) or ( | ||
| defined(onlyContributions) and onlyContributions == 1 | ||
| ) %} AND a.isContribution {% end %} | ||
| AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered) |
There was a problem hiding this comment.
this'll break on the client part because client still uses onlyContributions ?
| not defined(onlyContributions) | ||
| or (defined(onlyContributions) and onlyContributions == 1) | ||
| ) %} AND a.isContribution {% end %} | ||
| AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered) |
There was a problem hiding this comment.
this'll break on the client part because client still uses onlyContributions ?
| {% if not defined(onlyContributions) or ( | ||
| defined(onlyContributions) and onlyContributions == 1 | ||
| ) %} AND a.isContribution {% end %} | ||
| AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered) |
There was a problem hiding this comment.
this'll break on the client part because client still uses onlyContributions ?
Yes, there'a a PR for Insights that covers this. As far as I know, there's no other client that uses |
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
d10a98b to
8430f7b
Compare
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
| WHERE | ||
| (icc = 1 AND isCodeContribution = 1) | ||
| OR (icol = 1 AND isCollaboration = 1) | ||
| OR (ioc = 1 AND isCodeContribution = 0 AND isCollaboration = 0) |
There was a problem hiding this comment.
This replaces the
isContributionfilter in the pipes that use activities, and adds the ability to filter by activity type "category", by including a newincludeCodeContributionsandincludeCollaborationsparameters.By default, if neither parameter is passed to the pipe, it considers
includeCodeContributions = 1.The changes in
services/libs/tinybird/pipes/top_member_org_copy.pipewere done so that the "Top contributors" and "Top organizations" widgets in the app front page use filtered data.I'm not sure what to do for the "Top LF projects" widget, though. It uses the
projects_listpipe from Tinybird, but that one does not touch activities, so we can't filter its data by activity type directly. The pipe gets its data from theinsightsProjects_filteredpipe, which in turn gets its data frominsights_projects_populated_dsdata source, and that one is populated by a copy pipe, which does get activities data - but if we filter the data before it is copied, we affect every other pipe downstream from that one, which is no good.Note
Replaces isContribution filters with activity type-based filtering via new activityTypes_filtered pipe and updates related pipes/widgets to support includeCodeContributions/includeCollaborations.
isContributionchecks withAND (type, platform) IN (SELECT activityType, platform FROMactivityTypes_filtered)in:activities_filtered.pipe,activities_filtered_historical_cutoff.pipe,activities_filtered_retention.pipehealth_score_active_contributors.pipe,health_score_contributor_dependency.pipe,health_score_organization_dependency.pipesegmentId_aggregates_mv.pipeactivityTypes_filtered:includeCodeContributions,includeCollaborations(and docs tweaks torepos).activityTypes_filtered.pipe: parameterized filter overactivityTypeswithincludeCodeContributions(default 1),includeCollaborations,includeOtherContributions; outputs(activityType, platform).top_member_org_copy.pipe: constrain member/org activity counts to code contributions or collaborations usingactivityTypes(exclude other activity types).Written by Cursor Bugbot for commit 1b769fa. This will update automatically on new commits. Configure here.