
Hedy Architecture

Jesús Pelay edited this page Jun 13, 2024 · 6 revisions

Welcome to the Hedy Architecture document. Here we try to describe (in a lot of detail) how Hedy works internally and how parts fit together. It is not meant as a document to read at once in one sitting, it is more a place to look for information in case it is needed.

High-level overview

The basic idea of Hedy is that we allow learners to write code using the Hedy syntax, in a web page. The code page, and other web pages, are rendered by Flask (a Python web framework), using templates in Jinja.

When running Hedy code from the webpage, the Hedy code is converted (transpiled) into Python on the server side, using the Lark parser framework. The transpiled Python code is sent to the front-end and run in the front-end using Skulpt, a Python interpreter written in JavaScript.

[Figure: high-level architecture overview]

Transpiling

Transpiling (or transpilation) means translating code from one high-level language into another; in our case, from Hedy to Python.

It is an alternative to compiling (translating a high-level language into machine code or bytecode, as for example C does) and to interpreting (executing the language directly on a runtime, as Python does).

From Hedy to Python

The global idea of Hedy is that we transpile Hedy code to Python code by adding syntactic elements when they are missing. In the code base, this works in steps:

  1. Hedy code is parsed based on the relevant grammar, creating an AST. This is done using the open-source Lark parser.
  2. The validity of the code is checked with the IsValid() function.
  3. If the code is valid, it is transformed into Python with the relevant function.

This logic all resides in hedy.py.

The transpilation of Hedy code to Python is a stepwise process. First, the code is parsed using Lark, resulting in an AST. The AST is then scanned for invalid rules; if these appear in the tree, the Hedy program is invalid and an error message is generated. Next, a lookup table with all variable names occurring in the program is extracted from the AST. Finally, the AST is transformed into Python by adding the needed syntax, such as brackets.
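The parse → validate → transform pipeline can be sketched in plain Python. The following is a deliberately tiny, Lark-free stand-in for the real code in hedy.py (which parses with a Lark grammar and uses transformer classes); all function names here are illustrative, and it handles only a single toy `print` command:

```python
# Minimal sketch of the transpilation pipeline shape (hypothetical, not the real hedy.py).
# Real Hedy parses with Lark into an AST; here a (command, argument) tuple stands in.

def parse(hedy_code):
    """'Parse' a one-line program into a tiny AST: (command, argument)."""
    parts = hedy_code.strip().split(maxsplit=1)
    command = parts[0]
    argument = parts[1] if len(parts) > 1 else ''
    return (command, argument)

def is_valid(ast):
    """Check the 'AST' against the commands this toy level knows about."""
    return ast[0] in {'print', 'ask'}

def extract_lookup_table(ast):
    """Collect variable names; this toy program has none, so return an empty table."""
    return []

def transpile(hedy_code):
    ast = parse(hedy_code)
    if not is_valid(ast):
        raise ValueError(f"Invalid command: {ast[0]}")
    lookup = extract_lookup_table(ast)  # unused here, but part of the real pipeline
    command, argument = ast
    # Add the syntax Hedy leaves out: brackets and quotes.
    return f"print({argument!r})"

print(transpile('print hello world'))  # → print('hello world')
```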

Skipping Faulty Code

When the AST is scanned for invalid rules and actually contains an error, an exception is thrown. We catch the exception and transpile the code again, this time allowing 'invalid' code, which we skip. If this second attempt also fails, we re-raise the original exception. If it succeeds, the errors are caught by the source mapper and mapped to their locations. We then go through all the mapped errors and transpile each fragment again without allowing 'invalid' code, recovering the original exception per mapping. These exceptions are returned to the user alongside the partially functional code.
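The retry logic reads roughly like the sketch below. The names (`transpile_with_skipping`, the `skip_faulty` flag, `fake_transpile`) are illustrative stand-ins, not the real API in hedy.py, and the per-fragment source mapping is reduced to collecting the original error:

```python
# Sketch of the skip-faulty-code flow (simplified; names are hypothetical).

def transpile_with_skipping(code, transpile):
    """transpile(code, skip_faulty=...) is assumed to raise on invalid code
    when skip_faulty is False."""
    try:
        return transpile(code, skip_faulty=False), []
    except Exception as original_error:
        try:
            # Second attempt: allow 'invalid' nodes, which get skipped.
            python_code = transpile(code, skip_faulty=True)
        except Exception:
            # Skipping did not help either: surface the original error.
            raise original_error
        # The source mapper would now re-transpile each skipped fragment
        # without skipping, to recover the per-fragment exception.
        errors = [original_error]
        return python_code, errors

def fake_transpile(code, skip_faulty=False):
    """Toy transpiler: fails on 'bad' code unless skipping is allowed."""
    if 'bad' in code and not skip_faulty:
        raise ValueError('invalid')
    return 'print(1)'

code, errors = transpile_with_skipping('bad line', fake_transpile)
print(code, len(errors))  # → print(1) 1
```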

Technologies

The technologies used in building Hedy are the following:

Backend: Python, Lark, Flask, Jinja2, DynamoDB

The backend is a web server, implemented in Python using the Flask framework. Flask is used to render all pages on the website, such as the code page where programming happens, or the login page. To render pages, the templating framework Jinja2 is used. Templating engines work with HTML pages that are partly filled in, substituting the remaining parts with data at runtime.

For some things, like storing user data or programs, the backend accesses the database to load and store data. In the live web application, data is stored in AWS DynamoDB. Locally, we use a JSON text file. The most central part of the back-end is our transpiler, created with the Lark parser framework, which transforms Hedy into Python code that can be executed on the front-end.

All these technologies are explained in more depth below; please read on for a deeper dive!


Front-end: HTML, Tailwind, HTMX, HyperScript, TypeScript, Skulpt

Contributors are usually most comfortable with Python, so when given the choice between implementing something on the front-end or on the back-end, we tend to choose the back-end. The front-end is therefore preferably plain HTML and CSS, with a minimal amount of client-side code (written in TypeScript).

In-page interactivity is added by using either HTMX or HyperScript:

  • HTMX adds the ability to make calls to the back-end from any HTML element, and can apply the response as an inline page update. This lets us add interactivity by writing Python code, avoiding the need for TypeScript.
  • HyperScript is like JavaScript but more expressive, and can be used to add light interactivity to the page.

Here are some tips on when to use which technology:

  • If sorting or filtering data, use Python and render a template with Flask/Jinja2.
  • If clicking a button needs to read or write something to the database, use HTMX.
  • If clicking a button needs to show or hide some page element, or change the style of some other element, use HyperScript.

For more advanced use cases that cannot be solved using any of these existing mechanisms, new TypeScript can be added instead. However, this should rarely be necessary. There is quite a lot of TypeScript present in the code, which we would ideally reduce over time.

For CSS, we use Tailwind. Tailwind is a utility-based class framework, which means there are a lot of classes to set individual CSS properties. Usually, you put those classes directly into the HTML. In case it's desirable or necessary, a combination of styles can be made into a new class using a build step.

Jinja2 Templates

The main way we control what ends up on the page is by using Jinja2 templates. Most Flask routes end in a render_template('example.html', my_variable=my_variable) call, which takes the file templates/example.html, substitutes any variables in it (for example, using the value {{ my_variable }}), and sends it to the user's browser.

Control flow in Jinja2

Jinja2 offers us two ways of control flow in our templates: the {% for %} loop and the {% if %} conditional. These can be used to control the HTML that gets rendered in the HTML page:

For loops

The {% for %} loops allow us to iterate over collections of data inside our templates, much like the for loop in Python. This loop is used when you need to display the same (or similar) HTML elements a number of times, for each element in a collection. Even if the collection is a fixed size, or the elements are slightly different each time (for example, each has a different color), you can still use for loops. Have the Python code that provides the variables for the template precalculate the attributes that are different, or use the cycle function. For example, the quiz and the front page both use loops, even though the elements are different every time, as shown in the following snippet:

{% for section in content %}
    <div class="{{loop.cycle('stripe-white', 'stripe-colored')}} py-16">
        ...
    </div>
{% endfor %}

Notice how in the last snippet, the loop was closed with the {% endfor %}. It's mandatory to use this, otherwise Jinja won't know where the loop ends and will complain.

This is another example of how you can use a for loop, this time iterating over a list that comes from the Python code

{% for student in class_info.students %}
  <tr>
    <td>{{student.username}}</td>
    <td>{{student.last_login}}</td>
    <td>{{student.highest_level}}</td>
    <td>{{student.programs}}</td>
    <td><a href="/programs?user={{student.username}}">{{student.username}}'s Page</a></td>
  </tr>
{% endfor %}

Conditionals

{% if %} statements are another way to control what data is displayed on our pages. A very common use case is to check whether a variable is set before using its contents:

{% if invites %}  
  {% for invite in invites %}
        <tr>
            <td>{{invite.username}}</td>
            <td>{{invite.timestamp}}</td>
            <td>{{invite.expire_timestamp}}</td>
        </tr>
    {% endfor %}
{% endif %}

It's important to note that both of these statements can be placed anywhere in the HTML code. For example, if we want to display a red background when the button is disabled, and blue otherwise, we can write something like this:

<button class="{% if disabled %}bg-red{% else %}bg-blue{% endif %}">Submit</button>

Using HTMX to handle interaction

HTMX is a library that, among other things, gives you access to AJAX calls from every HTML element, instead of just from anchor tags, buttons and forms. It also lets you replace just parts of the page with the response from the server, rather than the entire page as in a classical web application. This allows us to handle interactivity from the server: any event can trigger a call to the server, which responds with HTML that is swapped in. We'll cover some basics here, but if you want a deeper dive, see the HTMX documentation.

The basis of HTMX is a series of attributes placed on an element to trigger an AJAX request. For example, we can trigger a request to the server when an input element changes, or when an option is picked in a select. The following code, taken from customize-class.html (with the non-important bits left out), replaces the element (and all of its child elements) with the id adventure-dragger:

<select name="level"        
        hx-get="/for-teachers/get-customization-level"
        hx-target="#adventure-dragger"
        hx-trigger="input"
        hx-indicator="#indicator">
    {% for i in range(1, max_level + 1) %}
    <option value="{{ i }}">
        {{ _('level_title') }} {{ i }}
    </option>
    {% endfor %}
</select>

There are a few aspects that we need to point out:

  1. hx-get means that it will issue a GET request to that endpoint.
  2. hx-target means that the HTML retrieved from the server will replace the element that matches that id.
  3. hx-trigger is the event that triggers the AJAX call to the server; in this case it is input.
  4. hx-indicator is the id of the indicator that shows up while the HTMX call is being processed, so the user knows what is happening. This is optional.
  5. hx-confirm is a message used to prompt the user before we issue a request. Add this attribute whenever you need a confirmation from the user for a specific action. Our TypeScript handles it automatically for all elements with this attribute present.

After that call reaches the server, it processes the data and returns HTML (notice that we are not dealing with JSON here), which is then swapped in. You might be asking how we pass the value of the option we just selected, since there is no body in that request. For certain elements, HTMX automatically picks up their values and sends them to the server, either in a form or in the query arguments of the call, under the name you put in the name attribute. In this case, the code that handles this on the server looks like this:

@route("/get-customization-level", methods=["GET"])
@requires_login
def change_dropdown_level(self, user):
    level = request.args.get('level')

As you can see, we retrieve the data using the key level, which is the value of the name attribute of the select.

Reducing code duplication in templates

Jinja 2 provides a few ways to help us reduce code duplication in our templates, making them more organized:

  • {% extends %} and {% block %}s. Use this if multiple pages have the same basic page structure, but have placeholders where different types of content are injected. For example, all pages ultimately extend layout.html, which includes the menu bar, the CSS and all scripts.
  • {% include %}. Use this either to reuse small snippets of HTML across multiple pages, or to separate out a bit of HTML into a different file for better code organization and readability. Files that are designed to be included (rather than used in a call to render_template) start with incl-. For example, the quiz has incl-question-progress.html, which is used on multiple pages to render the UI that indicates the current question number. Alternatively, menubar.html is only included from one place, but splitting it off into a separate file makes its code easy to find.
  • {% macro xyz(...) %}. Macros are like function calls: they are a way to define a parameterized template fragment that can be instantiated multiple times with different values. This is useful if you want to reduce duplication but the reused code isn't significant enough to warrant its own file. Macros can be defined in includable files to make libraries of reusable snippets (if you are planning to go this route, try to explore simpler options first). For example, adventure-tabs.html has a macro to render a tab, which gets called multiple times with different arguments. macros/stats-shared.html is a template designed to be included that defines a number of macros used in the statistics pages.

Conventions in template organization

We use the following organization and naming conventions in the templates:

  • There are a lot of files in this directory. To keep it organized, prefer using a directory by feature or site area if possible.
  • Template files that are intended to be included from other templates either start with incl- or are in the incl/ directory.
  • Template files that are intended to be rendered from Python using render_template(), but in response to an HTMX request so they don't render a full HTML page, start with htmx-.

Where to put code

How to organize code

How to make common changes

How to add hover effects

How to add click effects, selection effects

Adding new tables

GET vs POST

The database

There are two important places where we store data: log data goes to S3 (when a user transpiles a program), and all other information (for example, user accounts and the classes they are enrolled in) goes to DynamoDB. We refer to the latter here (and most often in the code) as the "database". DynamoDB is a so-called "NoSQL" database, and is therefore a little different from the SQL databases you are probably used to.

DynamoDB

The rows in a DynamoDB table are like JavaScript objects or Python dictionaries: a collection of key/value pairs. Unlike an SQL database, not all rows need to have the same keys, and not all rows need to have the same type for the same attribute. There is a lot of flexibility in a NoSQL database, but that puts more burden on the developer: the database does not support join operations, so you will have to implement those yourself; your application probably has expectations about the keys and types that make up a row, but your database is not going to help you ensure those. A bit of reasoning and diligent programming is required. This section explains the essence of DynamoDB's data model and operations to help you understand how to work with DynamoDB effectively.

The DynamoDB data model

Every table must have a partition key and may also have a sort key. Those fields must be present on every row that is saved, and together form the primary key of a row; all other fields may be added or left out as desired.

The operations you can do to rows in a table are:

  • put(object) - add a new row to the table, or completely overwrite an existing row if one exists already with the same primary key. There is also a batch_put variant of this operation which saves on network roundtrips for multiple rows.
  • get(primary_key) - retrieve a single row from the table, by its full primary key. There is also a batch_get variant of this operation.
  • delete(primary_key) - remove a row from the table by its full primary key. There is also a batch_delete variant of this operation.
  • query(partition_key, [sort_key_condition]) - search for any number of rows in the table by their exact partition key, and optionally a condition on the sort key. The sort key condition may be absent, in which case all rows with the given partition key are returned; or it may be a condition like sort_key == 5 (in which case query returns at most one element) or date >= '2023-01-10' (assuming date is the sort key in this table). Rows are always returned in order of the sort key, or in reverse of that order.
  • update(primary_key, updates, [conditions]) - update one or more fields in a single existing row, while leaving the other ones in place. Optionally you can add a condition, such as userid == 'alice' or row_version == 5, and the operation will fail if the condition is not met.
  • scan() - do not be smart, just return all rows of the table one by one.
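To make these operations concrete, here is a toy in-memory table that mimics the semantics above. This is purely an illustrative sketch, not the real wrapper around DynamoDB (the actual dynamo.Table class in the code base does much more, including batch operations, conditions, and indexes):

```python
# Toy in-memory table mimicking the DynamoDB operations described above.
# Illustrative only: the real dynamo.Table wrapper is far richer.

class Table:
    def __init__(self, partition_key, sort_key=None):
        self.partition_key = partition_key
        self.sort_key = sort_key
        self.rows = {}  # primary key tuple -> row dict

    def _key(self, row_or_key):
        pk = row_or_key[self.partition_key]
        sk = row_or_key.get(self.sort_key) if self.sort_key else None
        return (pk, sk)

    def put(self, row):
        # Completely overwrites any existing row with the same primary key.
        self.rows[self._key(row)] = dict(row)

    def get(self, key):
        return self.rows.get(self._key(key))

    def delete(self, key):
        self.rows.pop(self._key(key), None)

    def query(self, partition_value, sort_condition=lambda sk: True):
        # All rows with this partition key whose sort key passes the
        # condition, returned in sort key order.
        matches = [row for (pk, sk), row in self.rows.items()
                   if pk == partition_value and sort_condition(sk)]
        return sorted(matches, key=lambda r: r.get(self.sort_key))

    def update(self, key, updates):
        # Change some fields of an existing row, leaving the others in place.
        self.rows[self._key(key)].update(updates)

    def scan(self):
        return list(self.rows.values())

programs = Table('username', 'date')
programs.put({'username': 'alice', 'date': 2, 'code': 'print hi'})
programs.put({'username': 'alice', 'date': 1, 'code': 'print hello'})
print([p['date'] for p in programs.query('alice')])  # → [1, 2]
```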

As you can see from these operations, you can only retrieve objects by fields that have been marked as keys, and you always need at least the partition key. If you want to search for other fields, you can configure DynamoDB to create an index for another field.

An index behaves a lot like a table. It also has a partition key, and optionally a sort key. The difference is that the rows in an index are a copy of the rows in the actual data table, just keyed differently. When you create the index, you can decide whether you want to:

  • Copy all data fields to the index, effectively doubling the storage requirements of your table; or
  • Just copy the original table's primary key fields. After searching the index you will now need to follow up with a get on the original table to retrieve the full row, so this saves disk space but costs you an additional database query.

The only operation you can perform on an index is query.

Keys and querying

Every database operation has a cost: it costs money, and it costs time for the network request to be sent, serviced, and its response received. You can expect every simple database operation to add about 5 ms to the load time of a page; for query operations the cost depends on how many rows are returned (actual measurements indicate that selecting 10,000 items from a query takes around 3 seconds). Those numbers are pretty good, but our web application needs to service thousands of users constantly, and while the web server is waiting for DynamoDB to respond to one database query, other page load requests can't be handled. Never mind that waiting three seconds for a web page isn't a great experience.

The only way to control how much data is returned by a query is a judicious choice of the keys the data is stored under. In a NoSQL database, it is important to know how you're going to query the data before coming up with the key schema, so that you can optimize the schema for the query. A little preparation can turn a query that returns tens of thousands of items and needs additional client-side filtering or computation into a query that returns 1-5 items, or maybe even a single get. If you cannot make the table's own key schema into what you need to serve a page quickly, then add additional indexes with the key schema you need and query those. You could even have a synthetic field that combines the values of two other fields, so that a single query can search on multiple fields at the same time (as a practical example, our programs table has the field username_level: 'alice-7', so that we can quickly pinpoint all programs for a given user at a given level).
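A synthetic key like username_level is simply precomputed when writing the row, so a later query needs no client-side filtering. A minimal sketch (the helper name `program_row` is hypothetical; only the username_level field itself comes from the real programs table):

```python
# Sketch: precompute a synthetic key so one query replaces a fetch-and-filter.
# Assumes a table or index keyed on 'username_level', as in the programs table.

def program_row(username, level, code):
    return {
        'username': username,
        'level': level,
        # Synthetic field combining two attributes into one queryable key.
        'username_level': f'{username}-{level}',
        'code': code,
    }

row = program_row('alice', 7, 'print hello')
print(row['username_level'])  # → alice-7
```

Querying the username_level index for 'alice-7' then returns exactly alice's level-7 programs, instead of fetching all of alice's programs and filtering by level in Python.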

Try to issue as few queries as possible during every page load, and if you need information about multiple rows, try to find a way to store it precalculated in the database.

Type checking in the database

The type checking on our database validates all data inserted into the database in the emulation layer, locally on developers' machines. This helps prevent data errors from creeping into production, and also serves as documentation of the fields that go into our tables.

This is an example of how to use the type checking:

SURVEYS = dynamo.Table(storage, "surveys", "id",
                       types=only_in_dev({
                           'id': str,
                           'responses': Optional(DictOf({
                               str: RecordOf({
                                   'answer': str,
                                   'question': str
                               })
                           }))
                       }))

On the left-hand side of the dictionary we use the names of the fields; or, when the fields have inner structure but no specific name, we use DictOf to match any string as the field's name. On the right-hand side we can have any type accepted by JSON: numbers, strings, booleans, lists, and dictionaries, plus sets (which are a DynamoDB-specific type). There are two kinds of dictionary validators: RecordOf requires the field names to match exactly the ones we insert into the table, while DictOf matches any string as a field name.

In cases where a field can have two different types, we can use EitherOf; in particular, if one of those types is None, we can use Optional.
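To illustrate the semantics, here is a minimal sketch of how such validators could work. These are toy versions written for this document, not the real RecordOf/DictOf/Optional classes from the dynamo module, which are richer:

```python
# Toy versions of the schema validators described above (not the real dynamo code).

class RecordOf:
    """Dict whose keys must match the schema exactly."""
    def __init__(self, schema):
        self.schema = schema
    def check(self, value):
        return (isinstance(value, dict)
                and set(value) == set(self.schema)
                and all(check_type(v, self.schema[k]) for k, v in value.items()))

class DictOf:
    """Dict with arbitrary string keys, all values matching one schema."""
    def __init__(self, schema):
        self.value_schema = next(iter(schema.values()))
    def check(self, value):
        return (isinstance(value, dict)
                and all(isinstance(k, str) and check_type(v, self.value_schema)
                        for k, v in value.items()))

class Optional:
    """Either None or a value matching the inner schema."""
    def __init__(self, inner):
        self.inner = inner
    def check(self, value):
        return value is None or check_type(value, self.inner)

def check_type(value, schema):
    # Plain Python types are checked with isinstance; validator objects recurse.
    if isinstance(schema, type):
        return isinstance(value, schema)
    return schema.check(value)

schema = Optional(DictOf({str: RecordOf({'answer': str, 'question': str})}))
print(check_type({'q1': {'answer': 'yes', 'question': 'ok?'}}, schema))  # → True
print(check_type({'q1': {'answer': 'yes'}}, schema))  # → False (missing 'question')
```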

Tables in the database (more to come, but these are the main ones!)

We store the following tables (listing their fields/columns):

  • users: epoch, password, teacher, classes, created, keyword_language, username, last_login, program_count, language, birth_year, email, prog_experience, country, gender, experience_languages, verification_pending, teacher_request, third_party, is_teacher, heard_about
  • classes: date, teacher, link, id, name, students
  • programs: version, date, code, adventure_name, session, level, username, id, name, lang, public, error, username_level, submitted, server_error
  • public_programs: image, personal_text, last_achievement, username, country, tags, achievements, favourite_program

Feeding the local database

If you are developing locally, you do not need to connect to the actual database; we use a local database in development environments. This database is a text file (convenient, so you can inspect it easily!) called dev_database.py, and it is not tracked by Git. To feed this local database you can use one that has already been filled with data, data-for-testing.json, which contains:

  1. Five users, from user1 to user5.
  2. One teacher called teacher1 <-- with this account, you can test teacher facing features locally!
  3. Five students, from student1 to student5.
  4. A class called CLASS1.
  5. Several saved programs, quiz attempts, and achievements for some users.

The password for all of these accounts is 123456.

To feed the dev database with the data in this one, you can run:

doit run devdb

Adding new tables

Querying

Making a schema change

Syntax Highlighting

The syntax highlighting of Hedy works through the Lezer library, which is included with the CodeMirror editor we use in the front-end.

Lezer uses an incremental parsing approach, which means it doesn't need to re-parse the entire document each time the user edits it. It also means it doesn't produce a full-fledged AST, but rather a tree that is efficient and compact.

Lezer works by generating JavaScript modules containing the parser code, which are loaded in the application. To generate these parsers we write declarative grammars, located inside the highlighting/lezer-grammars folder.

An important aspect of our grammars stems from the fact that Hedy is translated to several natural languages, and keywords can be composed of multiple words. Because of this, instead of accepting a keyword exactly once, we actually accept it multiple times, like so:

Add { add+ Text toList+ Text }

In this rule, the keywords are add and toList, and the + sign means that they can be repeated 1 or more times. In practice this means we can accept strings like the following:

add add add 1 to to to list

But since we are not doing anything with the tree Lezer generates, it's ok to be a bit more lenient.

Another important aspect of keywords, is that they are defined in an external file called tokens.ts and imported in the grammars:

@external specialize { Text } specializeKeyword from "./tokens" {
    ask[@name="ask"],
    at[@name="at"],
    random[@name="random"],
    ifs[@name="if"],
    elses[@name="else"],
    pressed[@name="pressed"]
}

@external extend { Text } extendKeyword from "./tokens" {
    print[@name="print"],
    forward[@name="forward"],
    turn[@name="turn"],
    color[@name="color"],
    sleep[@name="sleep"],
    play[@name="play"],
    is[@name="is"],
    add[@name="add"],
    from[@name="from"],
    remove[@name="remove"],
    toList[@name="toList"],
    clear[@name="clear"],
    ins[@name="in"],
    not_in[@name="not_in"]
}

The difference between extend and specialize is that extend keywords are detected as keywords only in the right context, and in any other context can be used normally, whereas specialize keywords are always considered keywords, no matter the context.

Writing Lezer tests

To test the Lezer grammars, we use the same framework we use to test the front-end code: Cypress. The Lezer tests are therefore located in tests/cypress/e2e/lezer-tests and are composed like this:

describe('Lezer tests for level', () => { // This outer block groups the tests for the level
    describe('Successful tests', () => { // This one groups tests of a kind, for example successful tests
        describe('print test', () => { // The individual test
            const code = 'print hello world'; // the code you want to test
            const expectedTree = 'Program(Command(Print(print, Text, Text)))'; // The tree generated by Lezer
            multiLevelTester('Test print with text', code, expectedTree, minLevel, maxLevel); // can also be singleLevelTester
        })
    })
})

To be able to check the tree generated by Lezer in the front-end, add this code at the end of setHighlighterForLevel in the file cm-editor.ts:

const transaction = this.view.state.update({
    effects: StateEffect.appendConfig.of(EditorView.updateListener.of((v: ViewUpdate) => {
        if (v.docChanged) {
            console.log(language.parse(v.state.doc.toString()).toString());
        }
    }))
})
this.view.dispatch(transaction);

How to reduce Flask duplication

  • g
  • Scope of variables
  • session
  • preprocessor