Merge branch 'master' into berickson/makefile
bradley-erickson committed May 23, 2024
2 parents ea94a54 + 0e27266 commit 3470258
Showing 42 changed files with 1,228 additions and 371 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/versioning.yml
@@ -0,0 +1,27 @@
name: Versioning

on:
push:
branches:
- master

jobs:
build:
runs-on: ubuntu-latest
steps:
- name: GitHub Script
uses: actions/github-script@v7.0.1
with:
# The script to run
script: |
const shortSHA = context.sha.substring(0, 7);
const date = new Date().toISOString().split('T')[0]; // Gets date in YYYY-MM-DD format
const formattedDate = date.replace(/-/g, '.'); // Replaces '-' with '.'
const tagName = `${formattedDate}-${shortSHA}`;
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: `refs/tags/${tagName}`,
sha: context.sha
})
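
With this script, a push to `master` made on May 23, 2024 whose short SHA is `3470258` would be tagged `2024.05.23-3470258` (the date with dots, a hyphen, then the short commit SHA).
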
1 change: 1 addition & 0 deletions Makefile
@@ -1,6 +1,7 @@
PACKAGES ?= wo,awe

run:
# If you haven't done so yet, run: make install
# we need to make sure we are on the virtual env when we do this
cd learning_observer && python learning_observer

4 changes: 3 additions & 1 deletion autodocs/development.rst
@@ -11,6 +11,8 @@ Development
:parser: myst_parser.sphinx_
.. include:: ../modules/writing_observer/README.md
:parser: myst_parser.sphinx_
.. include:: ../docs/interactive_environments.md
:parser: myst_parser.sphinx_
.. include:: ../docs/privacy.md
:parser: myst_parser.sphinx_
.. include:: ../docs/config.md
@@ -20,4 +22,4 @@ Development
.. include:: ../docs/testing.md
:parser: myst_parser.sphinx_
.. include:: ../docs/technologies.md
:parser: myst_parser.sphinx_
:parser: myst_parser.sphinx_
96 changes: 96 additions & 0 deletions docs/interactive_environments.md
@@ -0,0 +1,96 @@
# Interactive Environments

The Learning Observer can launch itself in a variety of ways. These
include launching stand-alone or from within an IPython kernel. The
latter lets users interact directly with the LO system: users can
connect to the kernel through the command line or through Jupyter
clients such as the `ipython` console, Jupyter Lab, or Jupyter
Notebook. This is useful for debugging or for rapidly prototyping
within the system. When starting a kernel, the Learning Observer
application can be started alongside it.

## IPython Kernel Communications

We will give an overview of the IPython kernel architecture and how
we fit in. First, the IPython kernel architecture:

1. The `IPython` kernel handles event loops for the different
communications that occur within it.
1. These event loops handle code requests from the user and shutdown
requests from system messages.
1. The events are communicated using the [ZMQ Protocol](https://zeromq.org/).

There are five dedicated sockets over which these communications occur:

1. **Shell**: receives code-execution requests from clients
1. **IOPub**: broadcast channel carrying all side effects
1. **stdin**: raw input from the user
1. **Control**: dedicated to shutdown and restart requests
1. **Heartbeat**: ensures the connection stays alive

Upon startup, we create a separate thread to subscribe to, monitor,
and log events on the information-rich IOPub socket.
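
As a rough illustration, that monitoring thread could be built directly on `pyzmq`. This is a minimal sketch, and the IP address and port are placeholder assumptions; the real values come from the connection file described below:

```python
import threading

import zmq


def monitor_iopub(ip='127.0.0.1', iopub_port=5555):
    '''Subscribe to a kernel's IOPub socket and log every message.'''
    context = zmq.Context.instance()
    socket = context.socket(zmq.SUB)
    socket.connect(f'tcp://{ip}:{iopub_port}')
    socket.setsockopt_string(zmq.SUBSCRIBE, '')  # empty filter: receive everything
    while True:
        frames = socket.recv_multipart()  # raw Jupyter wire-protocol frames
        print(frames)  # a real implementation would log these instead of printing


threading.Thread(target=monitor_iopub, daemon=True).start()
```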

## Files

* The **kernel file** is [describe, provide a simplified example or link to one]
* The **connection file** is [describe, provide a simplified example or link to one]
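
Until those descriptions are filled in, here are generic examples of the standard Jupyter formats; the values are illustrative, not LO's actual files. A kernel file (`kernel.json`) tells clients how to launch a kernel:

```json
{
  "argv": ["python", "-m", "example_kernel", "-f", "{connection_file}"],
  "display_name": "Example Kernel",
  "language": "python"
}
```

A connection file lists the ZMQ ports and the HMAC key a client needs to talk to a running kernel:

```json
{
  "shell_port": 53794,
  "iopub_port": 53795,
  "stdin_port": 53796,
  "control_port": 53797,
  "hb_port": 53798,
  "ip": "127.0.0.1",
  "transport": "tcp",
  "signature_scheme": "hmac-sha256",
  "key": "a0436f6c-1916-498b-8eb9-e81ab9368e84"
}
```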

## Launching Learning Observer from an IPython Kernel

We use an
[aiohttp runner](https://docs.aiohttp.org/en/stable/web_reference.html#running-applications)
to serve the LO application through the kernel's internal
`ipykernel.io_loop`, a [Tornado](https://www.tornadoweb.org/en/stable/)
event loop. The runner attaches itself to the provided event loop,
whereas the normal startup method creates a new event loop of its own.
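
A minimal sketch of that runner pattern, assuming an existing aiohttp `app` and access to the kernel's Tornado loop (the names and port here are illustrative, not LO's actual code):

```python
from aiohttp import web


async def start_site(app, port=8888):
    '''Serve `app` on whatever event loop is already running.'''
    runner = web.AppRunner(app)  # runner-based startup; web.run_app would spin up its own loop
    await runner.setup()
    site = web.TCPSite(runner, 'localhost', port)
    await site.start()


# From another thread, schedule the coroutine on the Tornado loop's
# underlying asyncio loop, e.g.:
#     asyncio.run_coroutine_threadsafe(start_site(app), kernel.io_loop.asyncio_loop)
```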

## IPython Shell/Kernel

We can start up the server as:

* A kernel we connect to (e.g. from Jupyter Lab)
* An interactive shell including a kernel

```bash
# Start an interactive shell
python learning_observer/ --loconsole
```

The IPython kernel parses certain command-line arguments itself, which
we should block from reaching it. It also does not accept flags without
values (e.g. `--bar`).

```bash
# Start an ipython kernel
# note: the 1 is needed to make the ipython kernel instance we launch happy
python learning_observer/ --lokernel 1
# this will provide you with the kernel JSON file to use
# Connect to the specified kernel
jupyter console --existing kernel-123456.json
```

## Jupyter Clients

### Connect with Jupyter

Jupyter clients search a set of directories for available kernels.
We need to create the LO kernel files so that a client can offer the LO kernel as a choice.
Running the LO platform once will automatically create the kernel file in the `<virtual_env>/share/jupyter/kernels/` directory.

```bash
# run once to create the kernel file
python learning_observer/
# open jupyter client of your choice
jupyter lab
# select the LO kernel from the kernel dropdown
```

### Helpers

The system offers some helpers for working with the LO platform from a Jupyter client.
The `local_reducer.ipynb` notebook is an example in which we create a simple `event_count` reducer and a corresponding dashboard.
The notebook calls `jupyter_helpers.add_reducer_to_lo`, which handles adding your reducer to all relevant parts of the system, as sketched below.
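
A hypothetical sketch of that flow follows; the reducer signature and the `add_reducer_to_lo` call are assumptions drawn from the description above, not the documented API:

```python
import jupyter_helpers  # assumed import path for the LO Jupyter helpers


def event_count(event, state):
    '''Toy reducer: count every event that passes through.'''
    state = state or {'count': 0}
    state['count'] += 1
    return state


# Register the reducer with the running LO system so queries and
# dashboards can see it (hypothetical signature; see local_reducer.ipynb).
jupyter_helpers.add_reducer_to_lo(event_count)
```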

The goal here is to be able to rapidly prototype reducers, queries, and dashboards. In the longer term, we would like to be able to compile these into a module, and perhaps even inject them into a running Learning Observer system.

A long-term goal in building out this file is to have a smooth pathway from research code to production dashboards, using common tools and frameworks.
2 changes: 1 addition & 1 deletion learning_observer/learning_observer/auth/__init__.py
@@ -111,4 +111,4 @@ def verify_auth_precheck():
"If you are not planning to use Google auth (which is the case for most dev\n" + \
"settings), please disable Google authentication in creds.yaml by\n" + \
"removing the google_auth section under auth."
raise learning_observer.prestartup.StartupCheck(error)
raise learning_observer.prestartup.StartupCheck("Auth: " + error)
2 changes: 1 addition & 1 deletion learning_observer/learning_observer/auth/roles.py
@@ -45,6 +45,6 @@ def validate_user_lists():
paths.data(USER_FILES[k])
)
raise learning_observer.prestartup.StartupCheck(
f"Created a blank {k} file: static_data/{USER_FILES[k]}\n"
f"Created a blank {k} file: {paths.data(USER_FILES[k])}\n"
f"Populate it with {k} accounts."
)
51 changes: 47 additions & 4 deletions learning_observer/learning_observer/auth/social_sso.py
@@ -41,6 +41,35 @@
import learning_observer.constants as constants
import learning_observer.exceptions

import pmss
# TODO the hostname setting currently expects the port
# to be specified within the hostname. We ought to
# remove the port and instead use the port setting.
pmss.register_field(
name="hostname",
type=pmss.pmsstypes.TYPES.hostname,
description="The hostname of the LO webapp. Used to redirect OAuth clients.",
required=True
)
pmss.register_field(
name="protocol",
type=pmss.pmsstypes.TYPES.protocol,
description="The protocol (http / https) of the LO webapp. Used to redirect OAuth clients.",
required=True
)
pmss.register_field(
name="client_id",
type=pmss.pmsstypes.TYPES.string,
description="The Google OAuth client ID",
required=True
)
pmss.register_field(
name="client_secret",
type=pmss.pmsstypes.TYPES.string,
description="The Google OAuth client secret",
required=True
)


DEFAULT_GOOGLE_SCOPES = [
'https://www.googleapis.com/auth/userinfo.profile',
@@ -60,6 +89,20 @@
'https://www.googleapis.com/auth/classroom.announcements.readonly'
]

# TODO Type list is not yet supported by PMSS 4/24/24
# pmss.register_field(
# name='base_scopes',
# type='list',
# description='List of Google URLs to look for.',
# default=DEFAULT_GOOGLE_SCOPES
# )
# pmss.register_field(
# name='additional_scopes',
# type='list',
# description='List of additional URLs to look for.',
# default=[]
# )


async def social_handler(request):
"""Handles Google sign in.
@@ -91,10 +134,10 @@ async def _google(request):
if 'error' in request.query:
return {}

hostname = settings.settings['hostname']
protocol = settings.settings.get('protocol', 'https')
hostname = settings.pmss_settings.hostname()
protocol = settings.pmss_settings.protocol()
common_params = {
'client_id': settings.settings['auth']['google_oauth']['web']['client_id'],
'client_id': settings.pmss_settings.client_id(types=['auth', 'google_oauth', 'web']),
'redirect_uri': f"{protocol}://{hostname}/auth/login/google"
}

@@ -125,7 +168,7 @@ async def _google(request):
url = 'https://accounts.google.com/o/oauth2/token'
params = common_params.copy()
params.update({
'client_secret': settings.settings['auth']['google_oauth']['web']['client_secret'],
'client_secret': settings.pmss_settings.client_secret(types=['auth', 'google_oauth', 'web']),
'code': request.query['code'],
'grant_type': 'authorization_code',
})
2 changes: 1 addition & 1 deletion learning_observer/learning_observer/cache.py
@@ -23,7 +23,7 @@ def connect_to_memoization_kvs():
'key in `creds.yaml`.\n'\
'```\nmemoization:\n type: stub\n```\nOR\n'\
'```\nmemoization:\n type: redis_ephemeral\n expiry: 60\n```'
raise learning_observer.prestartup.StartupCheck(error_text)
raise learning_observer.prestartup.StartupCheck("KVS: "+error_text)


def async_memoization():
13 changes: 12 additions & 1 deletion learning_observer/learning_observer/dashboard.py
@@ -8,6 +8,7 @@
import json
import jsonschema
import numbers
import pmss
import queue
import time

@@ -35,6 +36,16 @@
import learning_observer.communication_protocol.schema
import learning_observer.settings

pmss.register_field(
name='dangerously_allow_insecure_dags',
type=pmss.pmsstypes.TYPES.boolean,
description='Data can be queried either by system defined execution DAGs '\
'(directed acyclic graphs) or user created execution DAGs. '\
'This is useful for developing new system queries, but should not '\
'be used in production.',
default=False
)


def timelist_to_seconds(timelist):
'''
@@ -439,7 +450,7 @@ async def dispatch_named_execution_dag(dag_name, funcs):

async def dispatch_defined_execution_dag(dag, funcs):
query = None
if not learning_observer.settings.settings.get('dangerously_allow_insecure_dags', False):
if not learning_observer.settings.pmss_settings.dangerously_allow_insecure_dags():
debug_log(await dag_submission_not_allowed())
funcs.append(dag_submission_not_allowed())
return query
6 changes: 3 additions & 3 deletions learning_observer/learning_observer/google.py
@@ -129,6 +129,8 @@ async def raw_google_ajax(runtime, target_url, **kwargs):
request = runtime.get_request()
url = target_url.format(**kwargs)
user = await learning_observer.auth.get_active_user(request)
if constants.AUTH_HEADERS not in request:
raise aiohttp.web.HTTPUnauthorized(text="Please log in") # TODO: Consistent way to flag this

cache_key = "raw_google/" + learning_observer.auth.encode_id('session', user[constants.USER_ID]) + '/' + learning_observer.util.url_pathname(url)
if settings.feature_flag('use_google_ajax') is not None:
@@ -139,8 +141,6 @@
GOOGLE_TO_SNAKE
)
async with aiohttp.ClientSession(loop=request.app.loop) as client:
if constants.AUTH_HEADERS not in request:
raise aiohttp.web.HTTPUnauthorized(text="Please log in") # TODO: Consistent way to flag this
async with client.get(url, headers=request[constants.AUTH_HEADERS]) as resp:
response = await resp.json()
learning_observer.log_event.log_ajax(target_url, response, request)
@@ -192,7 +192,7 @@ def connect_to_google_cache():
'```\ngoogle_cache:\n type: filesystem\n path: ./learning_observer/static_data/google\n'\
' subdirs: true\n```\nOR\n'\
'```\ngoogle_cache:\n type: redis_ephemeral\n expiry: 600\n```'
raise learning_observer.prestartup.StartupCheck(error_text)
raise learning_observer.prestartup.StartupCheck("Google KVS: " + error_text)


def initialize_and_register_routes(app):
22 changes: 19 additions & 3 deletions learning_observer/learning_observer/incoming_student_event.py
@@ -16,9 +16,8 @@
import os
import time
import traceback
import urllib.parse
import uuid
import socket
import weakref

import aiohttp

@@ -191,6 +190,10 @@ async def handler(request, client_event):
filename, preencoded=True, timestamp=True)
await pipeline(event)

# when the handler is garbage collected (no more events are being passed through),
# close the log file associated with this connection
weakref.finalize(handler, log_event.close_logfile, filename)

return handler


@@ -282,6 +285,8 @@ async def decode_and_log_event(events):
json_event = json.loads(msg.data)
log_event.log_event(json_event, filename=filename)
yield json_event
# done processing events, can close logfile now
log_event.close_logfile(filename)
return decode_and_log_event


@@ -307,6 +312,7 @@ async def incoming_websocket_handler(request):
await ws.prepare(request)
lock_fields = {}
authenticated = False
reducers_last_updated = None
event_handler = failing_event_handler

decoder_and_logger = event_decoder_and_logger(request)
@@ -334,14 +340,15 @@ async def update_event_handler(event):
if not authenticated:
return

nonlocal event_handler
nonlocal event_handler, reducers_last_updated
if 'source' in lock_fields:
debug_log('Updating the event_handler()')
metadata = lock_fields.copy()
else:
metadata = event
metadata['auth'] = authenticated
event_handler = await handle_incoming_client_event(metadata=metadata)
reducers_last_updated = learning_observer.stream_analytics.LAST_UPDATED

async def handle_auth_events(events):
'''This method checks a single method for auth and
@@ -406,6 +413,14 @@ async def filter_blacklist_events(events):
await ws.send_json(bl_status)
await ws.close()

async def check_for_reducer_update(events):
'''Check to see if the reducers updated
'''
async for event in events:
if reducers_last_updated != learning_observer.stream_analytics.LAST_UPDATED:
await update_event_handler(event)
yield event

async def pass_through_reducers(events):
'''Pass events through the reducers
'''
@@ -421,6 +436,7 @@ async def process_ws_message_through_pipeline():
events = decode_lock_fields(events)
events = handle_auth_events(events)
events = filter_blacklist_events(events)
events = check_for_reducer_update(events)
events = pass_through_reducers(events)
# empty loop to start the generator pipeline
async for event in events: