Kuba Ober edited this page Feb 3, 2017 · 97 revisions


A Notational Usability Benchmark for GUI Programming

7GUIs is a project concerned with the comparison of programs for a set of seven GUI related programming tasks that represent fundamental challenges in GUI programming.

There are countless GUI toolkits in different languages with diverse approaches to GUI development, yet diligent comparisons between these approaches are rare. 7GUIs is an attempt to provide a common basis for these comparisons. Whereas a traditional benchmark compares competing implementations primarily in terms of their runtime and memory requirements, 7GUIs can be understood as a benchmark that compares them in terms of the usability of the underlying source code (the notation) behind the resulting GUI applications, not the GUI applications themselves. To that end, 7GUIs also provides a recommended set of evaluation dimensions so that comparisons can be made more uniform.

One might wonder why such a project is useful. First, GUI programming is in fact not an easy task. The code behind GUI programs tends to become messy fast. Identifying better approaches to GUI programming could lead to the propagation of (new) useful GUI programming concepts that might make the life of programmers easier. Second, alternative approaches to GUI programming, and to programming in general, have gained popularity. It would be interesting to see, in practical terms, what advantages and disadvantages these alternatives have compared to the traditional OOP & MVC GUI development approach. Third, as mentioned above, there has been no (quasi) standardized set of tasks that represent typical GUI programming challenges. Finally, programming language usability, studied from an explorative and analytical perspective, is an interesting research area in itself.

The Seven Tasks
 Counter
 Temperature Converter
 Flight Booker
 Timer
 CRUD
 Circle Drawer
 Cells
Dimensions of Evaluation
Additional Tasks
Related Links

The Seven Tasks

To provide a useful basis the tasks were selected by the following criteria. The task set should be as small as possible yet reflect as many fundamental (or typical or representative) challenges in GUI programming as possible. Each task should be as simple and self-contained as possible yet not too artificial. Preferably, a task should be based on existing examples as that gives the task more justification to be useful and there already will be at least one reference for comparison.

Below, each task is described together with the challenges it reflects and a screenshot of the resulting GUI application in Java/Swing.

For a wonderful live version of the tasks where you can interact with them directly in your browser see FOAM's implementation.


Counter

Challenges: understanding the basic ideas of a language/toolkit and the essential scaffolding

The task is to build a frame containing a label or read-only textfield T and a button B. Initially, the value in T is “0” and each click of B increases the value in T by one.

Counter serves as a gentle introduction to the basics of the language, paradigm and toolkit by way of one of the simplest GUI applications. By comparing Counter implementations one can clearly see what basic scaffolding is needed and how the most basic features work together to build a GUI application. A good solution will have minimal scaffolding.
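As a toolkit-independent illustration, the state behind Counter can be sketched as follows. This is only a sketch under stated assumptions: the class `Counter`, its `subscribe` method and the list `shown` (which stands in for the text displayed in T) are hypothetical names, and a real solution would wire the same logic to the widgets of the chosen toolkit.

```python
from typing import Callable, List


class Counter:
    """Holds the value shown in T and notifies listeners when it changes."""

    def __init__(self) -> None:
        self.value = 0
        self._listeners: List[Callable[[int], None]] = []

    def subscribe(self, listener: Callable[[int], None]) -> None:
        # Call the listener once right away so the display starts at "0".
        self._listeners.append(listener)
        listener(self.value)

    def increment(self) -> None:
        # This is what a click on B would trigger.
        self.value += 1
        for listener in self._listeners:
            listener(self.value)


shown: List[str] = []  # hypothetical stand-in for the text displayed in T
counter = Counter()
counter.subscribe(lambda v: shown.append(str(v)))
counter.increment()  # first click on B
counter.increment()  # second click on B
```

Everything beyond these few lines (creating the frame, the label and the button, and connecting the click event) is the scaffolding that Counter is meant to expose.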

Temperature Converter

Challenges: working with bidirectional dataflow, working with user-provided text input

The task is to build a frame containing two textfields TC and TF representing the temperature in Celsius and Fahrenheit, respectively. Initially, both TC and TF are empty. When the user enters a numerical value into TC the corresponding value in TF is automatically updated and vice versa. When the user enters a non-numerical string into TC the value in TF is not updated and vice versa. The formula for converting a temperature F in Fahrenheit into a temperature C in Celsius is C = (F - 32) * (5/9) and the formula for the inverse direction is F = C * (9/5) + 32.

Temperature Converter increases the complexity of Counter by having a bidirectional dataflow between the Celsius and Fahrenheit value and the need to check the user input for validity. A good solution will make the bidirectional dependency very clear with minimal boilerplate code for the event-based connection of the two textfields.
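The two conversion formulas and the validity check can be sketched independently of any toolkit. The names `parse_number`, `Converter`, `edit_celsius` and `edit_fahrenheit` are hypothetical, and two details are assumptions that the task description leaves open: values are displayed with one decimal place, and non-finite inputs such as "nan" are treated as non-numerical.

```python
import math
from typing import Optional


def parse_number(text: str) -> Optional[float]:
    """Return the number in text, or None if text is not a finite number."""
    try:
        value = float(text.strip())
    except ValueError:
        return None
    return value if math.isfinite(value) else None


def celsius_to_fahrenheit(c: float) -> float:
    return c * (9 / 5) + 32


def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32) * (5 / 9)


class Converter:
    """Holds the text of TC and TF; each edit updates the other field."""

    def __init__(self) -> None:
        self.tc = ""  # both fields start out empty
        self.tf = ""

    def edit_celsius(self, text: str) -> None:
        self.tc = text
        c = parse_number(text)
        if c is not None:  # invalid input leaves TF untouched
            self.tf = f"{celsius_to_fahrenheit(c):.1f}"

    def edit_fahrenheit(self, text: str) -> None:
        self.tf = text
        f = parse_number(text)
        if f is not None:  # invalid input leaves TC untouched
            self.tc = f"{fahrenheit_to_celsius(f):.1f}"
```

Note that the bidirectional dependency is expressed here as two nearly identical callbacks. How much of this repetition a language or toolkit can remove is exactly what the task is meant to reveal.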

Temperature Converter is inspired by the Celsius/Fahrenheit converter from the book “Programming in Scala”, but it is such a widespread example (sometimes in the form of a currency converter) that one could give a thousand references for it. The same is true for the Counter task.

Flight Booker

Challenges: working with constraints

The task is to build a frame containing a combobox C with the two options “one-way flight” and “return flight”, two textfields T1 and T2 representing the start and return date, respectively, and a button B for “submitting” the selected flight. T2 is enabled iff C's value is “return flight”. When C has the value “return flight” and T2's date is strictly before T1's then B is disabled. When a non-disabled textfield T has an ill-formatted date then T is colored red and B is disabled. Clicking B displays a message informing the user of their selection (e.g. “You have booked a one-way flight on 04.04.2014.”). Initially, C has the value “one-way flight” and T1 as well as T2 have the same (arbitrary) date (it is implied that T2 is disabled).

The focus of Flight Booker lies on modelling constraints between widgets on the one hand and modelling constraints within a widget on the other hand. Such constraints are very common in everyday interactions with GUI applications. A good solution for Flight Booker will make the constraints clear, succinct and explicit in the source code and not hidden behind a lot of scaffolding.
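The constraints of Flight Booker can be written down as a handful of pure functions over the widget values, which a toolkit would then re-evaluate on every change. This is a sketch, not a reference solution: the function names are hypothetical, and the date format `%d.%m.%Y` is an assumption derived from the example date “04.04.2014”.

```python
from datetime import date, datetime
from typing import Optional

ONE_WAY = "one-way flight"
RETURN = "return flight"
DATE_FORMAT = "%d.%m.%Y"  # assumption, based on the example "04.04.2014"


def parse_date(text: str) -> Optional[date]:
    """Return the date in text, or None if text is ill-formatted."""
    try:
        return datetime.strptime(text.strip(), DATE_FORMAT).date()
    except ValueError:
        return None


def return_field_enabled(flight_type: str) -> bool:
    # T2 is enabled iff C's value is "return flight".
    return flight_type == RETURN


def is_red(text: str, enabled: bool) -> bool:
    # Only a non-disabled textfield with an ill-formatted date is colored red.
    return enabled and parse_date(text) is None


def can_book(flight_type: str, t1: str, t2: str) -> bool:
    """B is enabled iff all enabled dates are valid and correctly ordered."""
    start = parse_date(t1)
    if start is None:
        return False
    if not return_field_enabled(flight_type):
        return True  # T2 is disabled, so its content does not matter
    back = parse_date(t2)
    return back is not None and back >= start
```

For example, `can_book(RETURN, "04.04.2014", "03.04.2014")` is false because the return date is strictly before the start date, while the same two dates with `ONE_WAY` yield true.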

Flight Booker is directly inspired by the Flight Booking Java example in Sodium with the simplification of having textfields for date input instead of specialized date picking widgets as the focus of Flight Booker is not on specialized/custom widgets.


Timer

Challenges: working with concurrency, working with competing user/signal interactions, keeping the application responsive

The task is to build a frame containing a gauge G for the elapsed time e, a label which shows the elapsed time as a numerical value, a slider S by which the duration d of the timer can be adjusted while the timer is running and a reset button R. Adjusting S must be reflected in d immediately and not only when S is released. It follows that while moving S the filled amount of G will (usually) change immediately. When e ≥ d is true then the timer stops (and G will be full). If, thereafter, d is increased such that d > e is true then the timer resumes ticking until e ≥ d is true again. Clicking R will reset e to zero.

Timer deals with concurrency in the sense that a timer process that updates the elapsed time runs concurrently with the user's interactions with the GUI application. This also means that the solution's handling of competing user and signal interactions is tested. The fact that slider adjustments must be reflected immediately moreover tests the responsiveness of the solution. A good solution will make it clear that the signal is a timer tick and, as always, will have little scaffolding.
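The rules for e and d can be captured in a small model that both the tick signal and the user interactions operate on. The sketch below is an assumption-laden illustration: `TimerModel` and its members are hypothetical names, time is measured in seconds, and the concurrent tick source itself (a toolkit timer or a thread) is left out.

```python
class TimerModel:
    """State of the Timer task: elapsed time e and duration d in seconds."""

    def __init__(self, duration: float) -> None:
        self.duration = duration  # d, adjusted by the slider S
        self.elapsed = 0.0        # e, shown by the gauge G and the label

    @property
    def running(self) -> bool:
        # The timer only ticks while e < d.
        return self.elapsed < self.duration

    def tick(self, dt: float) -> None:
        """Advance e by dt seconds; called by the tick signal."""
        if self.running:
            self.elapsed = min(self.elapsed + dt, self.duration)

    def set_duration(self, duration: float) -> None:
        # Called on every slider movement, not only on release.
        self.duration = duration

    def reset(self) -> None:
        # Called when R is clicked.
        self.elapsed = 0.0

    @property
    def gauge_fraction(self) -> float:
        """Filled amount of G, between 0.0 and 1.0."""
        if self.duration <= 0:
            return 1.0
        return min(self.elapsed / self.duration, 1.0)
```

Because `running` is derived from e and d instead of being stored, increasing d after the timer has stopped makes it resume without any extra bookkeeping.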

Timer is directly inspired by the timer example in the paper Crossing State Lines: Adapting Object-Oriented Frameworks to Functional Reactive Languages.


CRUD

Challenges: separating the domain and presentation logic, managing mutation, building a non-trivial layout

The task is to build a frame containing the following elements: a textfield Tprefix, a pair of textfields Tname and Tsurname, a listbox L, buttons BC, BU and BD and the three labels as seen in the screenshot. L presents a view of the data in the database that consists of a list of names. At most one entry can be selected in L at a time. By entering a string into Tprefix the user can filter the names whose surname starts with the entered prefix; this should happen immediately without having to submit the prefix with enter. Clicking BC will append the name that results from concatenating the strings in Tname and Tsurname to L. BU and BD are enabled iff an entry in L is selected. In contrast to BC, BU will not append the resulting name but instead replace the selected entry with the new name. BD will remove the selected entry. The layout is to be done as suggested in the screenshot. In particular, L must occupy all the remaining space.

CRUD (Create, Read, Update and Delete) represents a typical graphical business application, a kind that arguably constitutes the lion's share of all GUI applications ever written. The primary challenge is the separation of domain and presentation logic in the source code, which is more or less forced on the implementer by the ability to filter the view by a prefix. Traditionally, some form of MVC pattern is used to achieve this separation. The approach to managing the mutation of the list of names is tested as well. A good solution will have a good separation between the domain and presentation logic without much overhead (e.g. in the form of toolkit-specific concepts or language/paradigm concepts), a mutation management that is fast but not error-prone and a natural representation of the layout (layout builders are allowed, of course, but would increase the overhead).
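One way to see why the prefix filter forces a separation is to write the domain part down on its own. In the following sketch the store knows nothing about widgets; the view only receives the filtered rows together with their positions in the store, so that a selection in the filtered listbox L can be mapped back to the right entry. All names (`Name`, `NameStore`, `visible`) are hypothetical, and the display format "surname, first name" is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Name:
    first: str
    surname: str

    def display(self) -> str:
        # Assumed display format for an entry in L.
        return f"{self.surname}, {self.first}"


class NameStore:
    """Domain logic: the list of names, independent of any widget."""

    def __init__(self, names: Iterable[Name] = ()) -> None:
        self._names: List[Name] = list(names)

    def create(self, name: Name) -> None:
        self._names.append(name)          # BC

    def update(self, index: int, name: Name) -> None:
        self._names[index] = name         # BU

    def delete(self, index: int) -> None:
        del self._names[index]            # BD

    def visible(self, prefix: str) -> List[Tuple[int, Name]]:
        """Rows for L: (store index, name) pairs matching the prefix."""
        return [
            (i, n)
            for i, n in enumerate(self._names)
            if n.surname.startswith(prefix)
        ]
```

A presentation layer would call `visible(prefix)` on every change of Tprefix and pass the store index of the selected row to `update` or `delete`.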

CRUD is directly inspired by the crud example in the blog post FRP - Three principles for GUI elements with bidirectional data flow.

Circle Drawer

Challenges: implementing undo/redo functionality, custom drawing, implementing dialog control (i.e. keeping the context between successive GUI operations)

The task is to build a frame containing an undo and redo button as well as a canvas area underneath. Left-clicking inside an empty area inside the canvas will create an unfilled circle with a fixed diameter whose center is the left-clicked point. The circle nearest to the mouse pointer such that the distance from its center to the pointer is less than its radius, if it exists, is filled with the color gray. The gray circle is the selected circle C. Right-clicking C will make a popup menu appear with one entry “Adjust diameter..”. Clicking on this entry will open another frame with a slider inside that adjusts the diameter of C. Changes are applied immediately. Closing this frame will mark the last diameter as significant for the undo/redo history. Clicking undo will undo the last significant change (i.e. circle creation or diameter adjustment). Clicking redo will reapply the last undone change unless new changes were made by the user in the meantime.

Circle Drawer's goal is, among other things, to test how well the common challenge of implementing undo/redo functionality for a GUI application can be solved. In an ideal solution the undo/redo functionality comes for free, or at least falls out as a natural consequence of the language/toolkit/paradigm. Moreover, Circle Drawer tests how dialog control*, i.e. keeping the relevant context between several successive GUI interaction steps, is achieved in the source code. Last but not least, the ease of custom drawing is tested.
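One common way to get undo/redo almost for free is to treat the whole drawing as an immutable value and keep a history of such values. The sketch below illustrates that idea only; it is not claimed to be how any particular 7GUIs implementation works. `History`, `commit` and the representation of a circle as an `(x, y, diameter)` tuple are assumptions made for the example.

```python
from typing import Generic, List, Tuple, TypeVar

S = TypeVar("S")


class History(Generic[S]):
    """Linear undo/redo history over immutable snapshots of type S."""

    def __init__(self, initial: S) -> None:
        self._past: List[S] = []
        self._present: S = initial
        self._future: List[S] = []

    @property
    def present(self) -> S:
        return self._present

    def commit(self, state: S) -> None:
        """Record a significant change; this discards any redoable states."""
        self._past.append(self._present)
        self._present = state
        self._future.clear()

    def undo(self) -> None:
        if self._past:
            self._future.append(self._present)
            self._present = self._past.pop()

    def redo(self) -> None:
        if self._future:
            self._past.append(self._present)
            self._present = self._future.pop()


Circle = Tuple[int, int, int]          # (x, y, diameter), an assumption
Drawing = Tuple[Circle, ...]           # immutable, so snapshots are cheap

history: History[Drawing] = History(())
history.commit(((10, 10, 30),))        # circle created by a left-click
history.commit(((10, 10, 50),))        # diameter dialog closed at 50
history.undo()                         # back to diameter 30
```

Intermediate slider values would update the canvas directly and only the value at the time the dialog is closed would be passed to `commit`, which matches the notion of a significant change in the task description.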

* Dialog control is explained in more detail in the paper Developing GUI Applications: Architectural Patterns Revisited starting on page seven.


Cells

Challenges: implementing change propagation, customizing a widget, implementing a more authentic/involved GUI application

The task is to create a simple but usable spreadsheet application. The spreadsheet should be scrollable. The rows should be numbered from 0 to 99 and the columns from A to Z. Double-clicking a cell C lets the user change C's formula. After the user has finished editing, the formula is parsed and evaluated and its updated value is shown in C. In addition, all cells which depend on C must be reevaluated. This process repeats until there are no more changes in the values of any cell (change propagation). Note that one should not just recompute the value of every cell but only of those cells that depend on another cell's changed value. If the toolkit already provides a spreadsheet widget, it should not be used. Instead, another similar widget (like JTable in Swing) should be customized to become a reusable spreadsheet widget.

Cells is a more authentic and involved task that tests if a particular approach also scales to a somewhat bigger application. The two primary GUI-related challenges are intelligent propagation of changes and widget customization. Admittedly, there is a substantial part that is not necessarily very GUI-related but that is just the nature of a more authentic challenge. A good solution's change propagation will not involve much effort and the customization of a widget should not prove too difficult. The domain-specific code is clearly separated from the GUI-specific code. The resulting spreadsheet widget is reusable.
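The change propagation described above can be sketched with a worklist: after a cell changes, only its dependents are reevaluated, and their dependents in turn only if their own value changed. The sketch makes several assumptions that the task does not prescribe: formulas are given as Python callables with an explicit list of dependencies instead of being parsed from text, empty cells count as 0.0, dependencies are acyclic, and the names `Sheet`, `set_formula`, `set_constant` and the counter `evaluations` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Formula = Tuple[Tuple[str, ...], Callable[..., float]]


class Sheet:
    """Cell values plus a dependency graph for change propagation."""

    def __init__(self) -> None:
        self.values: Dict[str, float] = {}
        self._formulas: Dict[str, Formula] = {}
        self._dependents: Dict[str, Set[str]] = {}
        self.evaluations = 0  # counts formula evaluations, for illustration

    def set_constant(self, name: str, value: float) -> None:
        self.set_formula(name, (), lambda: value)

    def set_formula(
        self, name: str, deps: Sequence[str], fn: Callable[..., float]
    ) -> None:
        # Remove the edges of the old formula before adding the new ones.
        old_deps, _ = self._formulas.get(name, ((), None))
        for dep in old_deps:
            self._dependents[dep].discard(name)
        self._formulas[name] = (tuple(deps), fn)
        for dep in deps:
            self._dependents.setdefault(dep, set()).add(name)
        self._propagate(name)

    def _evaluate(self, name: str) -> bool:
        """Recompute one cell; return True if its value changed."""
        deps, fn = self._formulas[name]
        self.evaluations += 1
        new_value = fn(*(self.values.get(dep, 0.0) for dep in deps))
        changed = self.values.get(name) != new_value
        self.values[name] = new_value
        return changed

    def _propagate(self, start: str) -> None:
        # Repeat until no value changes any more (assumes no cycles).
        pending: List[str] = [start]
        while pending:
            name = pending.pop(0)
            if self._evaluate(name):
                pending.extend(sorted(self._dependents.get(name, ())))
```

With cells A0 = 1, B0 = 10, A1 = A0 + 1 and A2 = A1 * B0, changing A0 reevaluates A0, A1 and A2, whereas editing an unrelated cell evaluates only that cell. A simple worklist like this can evaluate a cell more than once when it is reachable along several paths; ordering the work topologically would avoid that.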

Cells is directly inspired by the SCells spreadsheet example from the book “Programming in Scala”. Please refer to the book (or the implementations in this repository) for more details especially with respect to the not directly GUI-related concerns like parsing and evaluating formulas and the precise syntax and semantics of the spreadsheet language.

Dimensions of Evaluation

The following dimensions of evaluation are a subset of the dimensions from the Cognitive Dimensions of Notations (CDs) framework which is “an approach to analysing the usability of information artefacts”. CDs has been used in a variety of papers to analytically investigate the usability of programming language features or an API. Often, CDs is applied only insofar as it makes sense for a particular information artifact; that is, some of the 14 dimensions are left out and new ones may be added. In this way, CDs can be taken as a basis for the evaluation of different solutions to the 7GUIs benchmark. The following dimensions are thus a recommended subset of CDs which turned out to work well for the analysis of two different approaches to 7GUIs.

Again, the list of dimensions is only a recommendation to make it easier to get started with an analysis between different approaches to 7GUIs. Of course, you are free to use your own criteria as you see fit.

  • Abstraction Level: Types and availability of abstraction mechanisms
    • Does the system provide any way of defining new terms within the notation so that it can be extended to describe ideas more clearly? Can details be encapsulated? Does the system insist on defining new terms? What number of new high-level concepts have to be learned to make use of a system? Are they easy to use and easy to learn?
    • Each new idea is a barrier to learning and acceptance but can also make complex code more understandable. For example, Java Swing, the predecessor to JavaFX, employs a variation of the MVC design pattern in its general architecture and in particular for each of its widgets. Such being the case, there is a significant learning requirement to using the widgets reasonably well and often much boilerplate involved (“the system insists on defining new terms”) which does not pay off for simple applications. On the other hand, for very complex applications the MVC-architecture may make the code more understandable and manageable as details can be encapsulated in the new terms “Model, View and Controller”.
    • Another example is a function. A function has a name and, optionally, parameters as well as a body that returns a value following certain computational steps. A client can simply refer to a function by its name without knowing its implementation details. Accordingly, a function abstracts the computational process involved in the computation of a value. The learning barrier to the principle of a function is not great but it can still make a lot of code much more understandable by hiding unimportant details.
  • Closeness of Mapping: Closeness of representations to domain
    • How closely related is the notation to the result it is describing or to the problem domain? Which parts seem to be a particularly strange way of doing or describing something?
    • An example is the layout definition of a GUI. Languages that do not provide a way to describe the layout in a nested or hierarchical manner, and as such force the programmer to “linearize” the code with the introduction of meaningless temporary variables, make it hard to see how the structure of the layout definition relates to the resulting layout of the application. It is no coincidence that XML-based view specifications are widespread for GUI toolkits in languages without native support for hierarchical layout expressions.
  • Hidden Dependencies: Important links between entities invisible
    • Are dependencies between entities in the notation visible or hidden? Is every dependency indicated in both directions? Could local changes have confusing global effects?
    • If one entity cites another entity, which in turn cites a third, changing the value of the third entity may have unexpected repercussions. The key aspect is not the fact that A depends on B, but that the dependency is not made visible. A well-known illustration of a bad case of Hidden Dependencies is the fragile base class problem. In (complex) class hierarchies a seemingly safe modification to a base class may cause derived classes to malfunction. The IDE in general cannot help discover such problems and only certain programming language features can help prevent them. Another example is non-local side-effects in procedures, i.e. the dependencies of a procedure with non-local side-effects are not visible in its signature.
  • Error-proneness: Notation invites mistakes
    • To what extent does the notation influence the likelihood of the user making a mistake? Do some things seem especially complex or difficult (e.g. when combining several things)?
    • In many dynamic languages with implicit definitions of variables a typing error in a variable name can suddenly lead to hard to find errors as the IDE cannot always point out such an error due to the language’s dynamicity. Java’s different calling semantics for primitive and reference types may lead to mistakes if the programmer mixes them up. Implicit null-initialization of variables can lead to null-pointer exceptions if the programmer forgets to correctly initialize a variable before its use.
  • Diffuseness: Verbosity of language
    • How many symbols or how much space does the notation require to produce a certain result or express a meaning? What sorts of things take more space to describe?
    • Some notations can be annoyingly long-winded, or occupy too much valuable “real-estate” within a display area. Before version 8, Java used anonymous classes to express what are lambdas today. Compared to Java 8’s lambdas, these anonymous classes were a very verbose way of encoding anonymous functions, especially in a callback-heavy setting like traditional GUI programming.
  • Viscosity: Resistance to change
    • Are there any inherent barriers to change in the notation? How much effort is required to make a change to a program expressed in the notation?
    • A viscous system needs many user actions to accomplish one goal. Changing the return type of a function might lead to many code breakages in the call sites of said function. In such a case an IDE can be of great help. Creating a conceptual two-way data-binding by means of two callbacks involves more repetition than a more direct way to define such a dependency.
  • Commentary
    • This part is not so much a dimension as a place to mention everything else that is noteworthy and to give a conclusion. For instance, general observations that do not fit into the above dimensions, impressions during the development process, efficiency concerns of the resulting code and potential improvements can be addressed. In addition, the results of the other dimensions are attributed to the paradigm, the language, the toolkit or the IDE, whichever is responsible for them.
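The Error-proneness example of a mistyped variable name in a dynamic language can be made concrete in a few lines of Python. The class names `Model` and `StrictModel` and the misspelling `vlaue` are made up for this illustration; the point is only that the notation accepts the typo silently in the first case, whereas an opt-in language feature (`__slots__`) turns it into an immediate error in the second.

```python
class Model:
    def __init__(self) -> None:
        self.value = 0


model = Model()
model.vlaue = 42  # typo for "value": silently creates a second attribute
# model.value is still 0, and nothing has pointed out the mistake.


class StrictModel:
    __slots__ = ("value",)  # only the listed attributes may be assigned

    def __init__(self) -> None:
        self.value = 0


strict = StrictModel()
try:
    strict.vlaue = 42  # the same typo now raises AttributeError
    typo_caught = False
except AttributeError:
    typo_caught = True
```

In an analysis along the dimensions above, such an observation would be attributed to the language and not to the GUI toolkit.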


The 7GUIs repository already contains several implementations, for example in

You can make a pull request to add implementations to the 7GUIs repository or simply add a link in the following list to the repository of your 7GUIs implementation.


Having various implementations is good but without analyses of the different approaches it is very hard to identify the pros and cons. If you happened to create a blog post, an article, a video, a short overview etc. analysing/comparing the benefits or problems of one or more 7GUIs implementations then please add your link to the following list.

Additional Tasks

There are some concepts related to the seven concepts that give this site its name. These haven't yet been formulated as complete tasks.

Related Links

  • TodoMVC is similar in spirit to 7GUIs in the sense that a task is compared between different application frameworks (in different languages and paradigms) mostly in terms of the clarity of the source code behind the resulting application but also in terms of the performance. The difference to 7GUIs is that the focus of the TodoMVC task lies on web-based (single-page and/or MV*) application frameworks whereas the 7GUIs tasks focus more on traditional GUI challenges which are nonetheless still present in web-based GUIs.
  • Rosettacode's GUI category. Rosettacode is a general programming chrestomathy site with a category for GUI tasks. However, these tasks focus mainly on the specifics of a toolkit and not on fundamental GUI programming challenges.
  • Layout Manager Showdown. The author stumbled upon a complex layout task that could not be fulfilled by his GUI builder of choice. This task was used to compare different layout managers in terms of code clarity. The difference to 7GUIs is that complex layouts are but one GUI challenge (which is already somewhat reflected in 7GUIs' CRUD task) and not a mostly “complete” set of GUI challenges.
  • Flux Challenge is “A frontend challenge to test UI architectures and solutions” in the same vein as TodoMVC. The main challenge lies in handling tricky asynchrony elegantly, which I find interesting since I feel 7GUIs is lacking in this regard.