Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ClickHouse event sink support #35

Merged
merged 14 commits into from
May 15, 2023
Merged

Add ClickHouse event sink support #35

merged 14 commits into from
May 15, 2023

Conversation

bmtcril
Copy link
Contributor

@bmtcril bmtcril commented May 2, 2023

Replaces all "coursegraph" work with the new event-sink-clickhouse. Mostly this is renaming things, but it does install the event-sink-clickhouse plugin and configure it to talk to the Tutor-configured ClickHouse. Unfortunately this means that many downstream Superset assets needed to be deleted since they referenced the old names of things. New reports will be created as part of a separate set of tasks that Axim is contracting out.

@bmtcril bmtcril requested a review from pomegranited May 2, 2023 23:09
tutoroars/plugin.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Ian2012 Ian2012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Makefile Show resolved Hide resolved
tutoroars/patches/openedx-cms-common-settings Outdated Show resolved Hide resolved
tutoroars/patches/openedx-cms-common-settings Outdated Show resolved Hide resolved
tutoroars/plugin.py Outdated Show resolved Hide resolved
tutoroars/plugin.py Outdated Show resolved Hide resolved
@Ian2012
Copy link
Contributor

Ian2012 commented May 9, 2023

@bmtcril Can you rebase this branch?

Removing the charts, dashboards, datasets, and hard coded data for the coursegraph implementation we used for the demo. The new implementation is in the works and will have different table names, meaning all of these will need to be regenerated anyway.
This new package replaces the temporary coursegraph hack. This adds configuration for it to talk to our configured ClickHouse, handles table creation, and renames the coursegraph variables.
Type checking doesn't get us much in this repo since it's a lot of boilerplate.
openedx-event-sink-clickhouse has updated changes, so we need to change these columns and names to match.
This is just for testing. Once that branch merges we will release the package and update this PR to use the PyPI package here.
Also fix columns that were incorrectly nullable and fix up some primary key and sort columns.
@mariajgrimaldi
Copy link
Member

mariajgrimaldi commented May 10, 2023

I know the testing instructions are using a local env instead of a development one but either way, I tried with one, and it's failing when initializing:

tutor dev do init
...
Running schema creation scripts...
Code: 210. DB::NetException: Connection refused (clickhouse:9000). (NETWORK_ERROR)

Is dev not supported yet?

Update: this might be related to the missing table issue.

@Ian2012
Copy link
Contributor

Ian2012 commented May 10, 2023

@bmtcril I think you are missing adding the course_overview table schema in the clickhouse init job

Copy link
Contributor

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmtcril This is working great, with two small exceptions:

  1. If we're going to remove all the dashboards, we also need to update the superset_dashboard.py init script:

    diff --git a/tutoroars/templates/oars/apps/pythonpath/superset-dashboard.py b/tutoroars/templates/oars/apps/pythonpath/superset-dashboard.py
    index c535202..b4e8925 100644
    --- a/tutoroars/templates/oars/apps/pythonpath/superset-dashboard.py
    +++ b/tutoroars/templates/oars/apps/pythonpath/superset-dashboard.py
    @@ -61,11 +61,13 @@ def update_assets():
             os.unlink(zip_file.name)
    
         # Mark the imported dashboard as Published
    -    dashboard = superset.dashboards.find(slug=OPENEDX_DASHBOARD_SLUG)[0]
    -    dashboard.published = True
    -    # TODO: Enable feature flag DASHBOARD_RBAC, and set dashboard.roles = ["Open edX"]
    -    # Consider finishing https://github.com/opus-42/superset-api-client/pull/31
    -    dashboard.save()
    +    openedx_dashboards = superset.dashboards.find(slug=OPENEDX_DASHBOARD_SLUG)
    +    if openedx_dashboards:
    +        dashboard = openedx_dashboards[0]
    +        dashboard.published = True
    +        # TODO: Enable feature flag DASHBOARD_RBAC, and set dashboard.roles = ["Open edX"]
    +        # Consider finishing https://github.com/opus-42/superset-api-client/pull/31
    +        dashboard.save()
  2. I got errors about a missing RALPH_HTTP_PROTOCOL variable, so pulled in the changes from feat: add support for clickhouse cloud #38, and that fixed them.

👍

  • I tested this with a fresh Tutor local devstack, and by following the PR instructions.
  • I read through the code
  • I checked for accessibility issues N/A
  • Includes documentation N/A
  • Commit structure follows OEP-0051

tutoroars/plugin.py Outdated Show resolved Hide resolved
@bmtcril
Copy link
Contributor Author

bmtcril commented May 11, 2023

@mariajgrimaldi I just saw the issue with clickhouse connection that you saw on dev. It happened because I was trying to init before clickhouse was actually started. I tried it 2 secs later and it worked. I'm not sure how to solve this without putting a health check in, which we've seen requires a newer Docker Compose. 🤔

@bmtcril
Copy link
Contributor Author

bmtcril commented May 11, 2023

I think this is ready for final review if anyone else wants to take a look before tomorrow morning.

…sion

chore: pin event-routing-backends to 5.2.2
@Ian2012
Copy link
Contributor

Ian2012 commented May 12, 2023

@bmtcril Is it ready for review?

cc @mariajgrimaldi

@bmtcril
Copy link
Contributor Author

bmtcril commented May 12, 2023

@Ian2012 @mariajgrimaldi yeah everything I know of is fixed

@Ian2012
Copy link
Contributor

Ian2012 commented May 12, 2023

@bmtcril Just one more thing, can you upgrade the event-routing-backends requirements to 5.3.0?

This brings in some performance improvements and idempotent xAPI event ids!
@bmtcril
Copy link
Contributor Author

bmtcril commented May 15, 2023

I know that ERB 5.3.0 is messed up on Nutmeg, but I think in order to keep things moving I'll merge this and try to either fix that forward or roll back to the last release in another PR if that is what's needed.

@bmtcril bmtcril merged commit 170a374 into main May 15, 2023
2 checks passed
@bmtcril bmtcril deleted the bmtcril/add_event_sink branch May 15, 2023 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked by other work PR cannot be finished until other work is complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants