# Processing the RICO Dataset

> Generating training data for a link classifier.

This data processing is part of a master thesis project titled "Assignment-Based Link Optimization in GUI Prototyping Using Incremental Supervised Classification" by [Christoph A. Johns](mailto:christophjohns@aalto.fi?subject=[GitHub]%20Suggested%20Links%Figma%Plugin) at German Research Center for Artificial Intelligence (DFKI) and Aalto University.
The project is supervised by Michael Barz and Antti Oulasvirta.

## Outline

In this file, we…:

1. Load and view the [RICO](https://interactionmining.org/rico) data and its structure.
2. Generate feature vectors for the [link classifier](https://github.com/christophajohns/suggested-links-classifier)
3. Output a file with training data for the link classifier

The training data has to have the following structure (formatted as JSON):

```JSON
{
    "data": [
        {
            "link": {
                "source": {
                    "id": "23:10",
                    "color": {
                        "r": 0.17,
                        "g": 0.61,
                        "b": 0.86
                    },
                    "characters": "More details"
                },
                "target": {
                    "id": "23:12",
                    "topics": [
                        "design",
                        "technology",
                        "shape",
                        "detail",
                        "classic"
                    ]
                },
                "context": [
                    [
                        "project",
                        "pattern",
                        "patent",
                        "invention",
                        "redesign"
                    ],
                    [
                        "layout",
                        "intend",
                        "create",
                        "contrive",
                        "purpose",
                    ],
                    [
                        "design",
                        "technology",
                        "shape",
                        "detail",
                        "classic"
                    ]
                ]
            },
            "is_link": true
        }
    ]
}
```

## Loading and Viewing Data

First, we load the RICO data and look at its structure, starting with the view hierarchies.

In [6]:
import pandas as pd
import os, json
from tqdm import tqdm

RICO_DATA = 'rico-data'
VIEW_HIERARCHIES = 'combined'
view_hierarchies_folder = f'{RICO_DATA}/{VIEW_HIERARCHIES}/'

json_file_paths = [f'{view_hierarchies_folder}/{file_name}' for file_name in os.listdir(view_hierarchies_folder) if file_name.endswith('.json')]

view_hierarchies_dfs = []
for json_file_path in tqdm(json_file_paths):
    with open(json_file_path) as json_file:
        view_hierarchy_data = json.load(json_file)
        view_hierarchy_df = pd.json_normalize(view_hierarchy_data)
        view_hierarchies_dfs.append(view_hierarchy_df)

view_hierarchies = pd.concat(view_hierarchies_dfs)
view_hierarchies.head()

100%|██████████| 66261/66261 [08:23<00:00, 131.68it/s] 


Unnamed: 0,activity_name,is_keyboard_deployed,request_id,activity.root.scrollable-horizontal,activity.root.draw,activity.root.ancestors,activity.root.clickable,activity.root.pressed,activity.root.focusable,activity.root.long-clickable,...,activity.root.selected,activity.root.scrollable-vertical,activity.root.children,activity.root.adapter-view,activity.root.abs-pos,activity.root.pointer,activity.root.class,activity.root.visible-to-user,activity.added_fragments,activity.active_fragments
0,com.siplay.tourneymachine_android/com.siplay.t...,False,29,False,True,"[android.widget.FrameLayout, android.view.View...",False,not_pressed,False,False,...,False,False,"[{'scrollable-horizontal': False, 'draw': True...",False,True,dc5e2,com.android.internal.policy.PhoneWindow$DecorView,True,[],[]
0,com.ebates/com.ebates.activity.DrawerActivity,False,5576,False,True,"[android.widget.FrameLayout, android.view.View...",False,not_pressed,False,False,...,False,False,"[{'scrollable-horizontal': False, 'draw': True...",False,True,758eb5f,com.android.internal.policy.PhoneWindow$DecorView,True,[],[]
0,com.adda247.app/com.adda247.modules.home.HomeA...,False,121,False,True,"[android.widget.FrameLayout, android.view.View...",False,not_pressed,False,False,...,False,False,"[{'scrollable-horizontal': False, 'draw': True...",False,True,33a6b88,com.android.internal.policy.PhoneWindow$DecorView,True,[],[]
0,com.go.abclocal.kabc.android.weather/com.wdtin...,False,510,False,True,"[android.widget.FrameLayout, android.view.View...",False,not_pressed,False,False,...,False,False,"[{'scrollable-horizontal': False, 'draw': True...",False,True,3301432,com.android.internal.policy.PhoneWindow$DecorView,True,[],[]
0,kik.android/kik.android.chat.fragment.SimpleFr...,False,927,False,True,"[android.widget.FrameLayout, android.view.View...",False,not_pressed,False,False,...,False,False,"[{'scrollable-horizontal': False, 'draw': True...",False,True,2bdbd3d,com.android.internal.policy.PhoneWindow$DecorView,True,[],[]


In [7]:
# Save view hierarchies data
DATA = 'data'
VIEW_HIERARCHIES_FILE_NAME = 'view_hierarchies.csv'
view_hierarchies_df_path = f'{DATA}/{VIEW_HIERARCHIES_FILE_NAME}'
view_hierarchies.to_csv(view_hierarchies_df_path)
print('Saved view hierarchies file.')

Saved view hierarchies file.


In [8]:
view_hierarchies.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 66261 entries, 0 to 0
Data columns (total 26 columns):
 #   Column                               Non-Null Count  Dtype 
---  ------                               --------------  ----- 
 0   activity_name                        66261 non-null  object
 1   is_keyboard_deployed                 66261 non-null  bool  
 2   request_id                           66260 non-null  object
 3   activity.root.scrollable-horizontal  66261 non-null  bool  
 4   activity.root.draw                   66261 non-null  bool  
 5   activity.root.ancestors              66261 non-null  object
 6   activity.root.clickable              66261 non-null  bool  
 7   activity.root.pressed                66261 non-null  object
 8   activity.root.focusable              66261 non-null  bool  
 9   activity.root.long-clickable         66261 non-null  bool  
 10  activity.root.enabled                66261 non-null  bool  
 11  activity.root.bounds                 66261 no

## Getting the text elements in a UI

Based on thie view hierarchy data of one file (representing one UI), we identify the text elements via their class.

In [44]:
from pprint import pprint

RICO_DATA = 'rico-data'
VIEW_HIERARCHIES = 'combined'
EXAMPLE_FILE_NAME = '9999.json'
example_view_hierarchy_path = f'{RICO_DATA}/{VIEW_HIERARCHIES}/{EXAMPLE_FILE_NAME}'

with open(example_view_hierarchy_path) as json_file:
    example_view_hierarchy = json.load(json_file)

pprint(example_view_hierarchy)

{'activity': {'active_fragments': [],
              'added_fragments': [],
              'root': {'abs-pos': True,
                       'adapter-view': False,
                       'ancestors': ['android.widget.FrameLayout',
                                     'android.view.ViewGroup',
                                     'android.view.View',
                                     'java.lang.Object'],
                       'bounds': [0, 0, 1440, 2392],
                       'children': [{'abs-pos': True,
                                     'adapter-view': False,
                                     'ancestors': ['android.view.ViewGroup',
                                                   'android.view.View',
                                                   'java.lang.Object'],
                                     'bounds': [0, 0, 1440, 2392],
                                     'children': [{'abs-pos': True,
                                                   'adapter-view': Fal

In [121]:
def get_base_elements(activity_root):

    base_elements = []

    stack = []
    stack.append(activity_root)

    while len(stack) > 0:
        top_element = stack.pop()
        if top_element is not None:
            if 'children' not in top_element:
                base_elements.append(top_element)
            else:
                for child in top_element['children']:
                    stack.append(child)
    
    return base_elements

base_elements = get_base_elements(example_view_hierarchy['activity']['root'])
pprint(base_elements)

[{'abs-pos': True,
  'adapter-view': False,
  'ancestors': ['android.view.View', 'java.lang.Object'],
  'bounds': [0, 425, 0, 425],
  'class': 'android.widget.TextView',
  'clickable': False,
  'content-desc': [None],
  'draw': False,
  'enabled': True,
  'focusable': False,
  'focused': False,
  'font-family': 'default',
  'long-clickable': False,
  'package': 'com.ter.androidapp',
  'pointer': '734f6b4',
  'pressed': 'not_pressed',
  'rel-bounds': [0, 0, 0, 0],
  'resource-id': 'com.ter.androidapp:id/btnGoToSettings',
  'scrollable-horizontal': False,
  'scrollable-vertical': False,
  'selected': False,
  'text': 'PARAMÈTRES',
  'visibility': 'visible',
  'visible-to-user': False},
 {'abs-pos': True,
  'adapter-view': False,
  'ancestors': ['android.view.View', 'java.lang.Object'],
  'bounds': [0, 425, 0, 425],
  'class': 'android.widget.TextView',
  'clickable': False,
  'content-desc': [None],
  'draw': False,
  'enabled': True,
  'focusable': False,
  'focused': False,
  'font-fam

In [122]:
def get_text_elements(activity_root):
    base_elements = get_base_elements(activity_root)
    text_elements = [element for element in base_elements if 'text' in element and len(element['text']) > 0]
    return text_elements

def get_relevant_features(text_element):
    relevant_keys = ['text', 'clickable', 'bounds']
    return {key: value for key, value in text_element.items() if key in relevant_keys}

text_elements = get_text_elements(example_view_hierarchy['activity']['root'])
pprint([get_relevant_features(text_element) for text_element in text_elements])

[{'bounds': [0, 425, 0, 425], 'clickable': False, 'text': 'PARAMÈTRES'},
 {'bounds': [0, 425, 0, 425],
  'clickable': False,
  'text': 'Confirmez votre train :'},
 {'bounds': [70, 815, 1370, 983], 'clickable': True, 'text': 'Rechercher'},
 {'bounds': [425, 665, 1370, 787], 'clickable': True, 'text': '7 février 2017'},
 {'bounds': [70, 681, 213, 762], 'clickable': False, 'text': 'DATE'},
 {'bounds': [70, 530, 414, 611], 'clickable': False, 'text': 'N° DU TRAIN'},
 {'bounds': [967, 284, 1433, 410], 'clickable': True, 'text': 'TRAINS'},
 {'bounds': [487, 284, 953, 410], 'clickable': True, 'text': 'ITINÉRAIRES'},
 {'bounds': [7, 284, 473, 410], 'clickable': True, 'text': 'GARES'},
 {'bounds': [0, 1264, 980, 1432], 'clickable': True, 'text': 'A propos'},
 {'bounds': [0, 1092, 980, 1260],
  'clickable': True,
  'text': 'Mentions légales'},
 {'bounds': [0, 916, 980, 1084],
  'clickable': True,
  'text': "Partager l'application"},
 {'bounds': [0, 769, 980, 898],
  'clickable': True,
  'text': 

In [123]:
if len(text_elements) > 0:
    pprint(text_elements[0])

{'abs-pos': True,
 'adapter-view': False,
 'ancestors': ['android.view.View', 'java.lang.Object'],
 'bounds': [0, 425, 0, 425],
 'class': 'android.widget.TextView',
 'clickable': False,
 'content-desc': [None],
 'draw': False,
 'enabled': True,
 'focusable': False,
 'focused': False,
 'font-family': 'default',
 'long-clickable': False,
 'package': 'com.ter.androidapp',
 'pointer': '734f6b4',
 'pressed': 'not_pressed',
 'rel-bounds': [0, 0, 0, 0],
 'resource-id': 'com.ter.androidapp:id/btnGoToSettings',
 'scrollable-horizontal': False,
 'scrollable-vertical': False,
 'selected': False,
 'text': 'PARAMÈTRES',
 'visibility': 'visible',
 'visible-to-user': False}


## Identifying links

Next, we determine the targets of those text elements that act as links.
For this, we load the interaction traces treating every first touch points as identifying the source element and the next view as the target.

In [124]:
RICO_DATA = 'rico-data'
INTERACTION_TRACES = 'filtered_traces'
EXAMPLE_APP = 'aero.sita.lab.resmobileweb.android.mh'
example_interaction_trace_path = f'{RICO_DATA}/{INTERACTION_TRACES}/{EXAMPLE_APP}/trace_0'
gestures_path = f'{example_interaction_trace_path}/gestures.json'

with open(gestures_path) as json_file:
    example_gestures = json.load(json_file)

pprint(example_gestures)

{'370': [[0.8251563251563252, 0.35173160173160173]],
 '488': [[0.373015873015873, 0.22402597402597402]],
 '526': [[0.5634920634920635, 0.8982683982683982]],
 '614': [[0.16907166907166907, 0.3365800865800866]],
 '642': [[0.3460798460798461, 0.13636363636363635],
         [0.34415584415584416, 0.13636363636363635]],
 '683': [[0.4134199134199134, 0.25757575757575757],
         [0.41534391534391535, 0.25865800865800864],
         [0.4172679172679173, 0.2597402597402597]],
 '688': [[0.09018759018759019, 0.07034632034632035]],
 '700': [[0.25757575757575757, 0.7867965367965368],
         [0.2594997594997595, 0.7867965367965368],
         [0.26142376142376145, 0.7867965367965368],
         [0.26334776334776333, 0.7867965367965368],
         [0.26527176527176527, 0.7867965367965368],
         [0.26527176527176527, 0.7857142857142857],
         [0.2671957671957672, 0.7857142857142857],
         [0.2671957671957672, 0.7846320346320347],
         [0.2691197691197691, 0.7846320346320347],
         

In [239]:
# from PIL import Image, ImageDraw, ImageColor
# from collections import Counter
# from sklearn.cluster import KMeans
from copy import copy
import cv2
import numpy as np
from math import floor, ceil

# Construct positive examples from interaction trace
# Construct negative examples from non-clickable elements
def touchpoint_on_element(touchpoint, element):
    x, y = touchpoint
    bounds = element['bounds']
    x_top_left, y_top_left, x_bottom_right, y_bottom_right = bounds
    return x > x_top_left and y > y_top_left and x < x_bottom_right and y < y_bottom_right

def has_additional_screen(current_index, gestures):
    return len(gestures.keys()) > current_index + 1

def is_click(gesture):
    return len(gesture[1]) == 1

def get_targets(interaction_trace_path):
    view_hierarchies_folder = f'{interaction_trace_path}/view_hierarchies'
    targets = {}
    for file_name in os.listdir(view_hierarchies_folder):
        json_file_path = f'{view_hierarchies_folder}/{file_name}'
        with open(json_file_path) as json_file:
            view_hierarchy_data = json.load(json_file)
            if view_hierarchy_data is not None:
                text_elements = get_text_elements(view_hierarchy_data['activity']['root'])
                screen_id = file_name.split('.')[0]
                target = {
                    'id': screen_id,
                    'topics': [text_element['text'] for text_element in text_elements]
                }
                targets[screen_id] = target
    return targets

def preprocess_image(raw_image):
    image = cv2.resize(raw_image, (18, 32), interpolation=cv2.INTER_NEAREST)                                          
    #image = image.reshape(image.shape[0]*image.shape[1], 3)
    return image

# def get_element_colors(element, image, view_hierarchy):
#     screen_width, screen_height = view_hierarchy['activity']['root']['bounds'][2:]
#     image_height, image_width = image.shape[:2]
#     bounds = copy(element['bounds'])
#     for index, position in enumerate(bounds):
#         if index % 2:
#             bounds[index] = position * image_width/screen_width
#         else:
#             bounds[index] = position * image_height/screen_height
#     clf = KMeans(n_clusters=2)
#     color_labels = clf.fit_predict(image)
#     center_colors = clf.cluster_centers_ 
#     counts = Counter(color_labels)
#     sorted_colors = [center_colors[i] for i in counts.keys()]
#     return sorted_colors

def get_color(element, image, view_hierarchy):
    colors = get_area_colors(element, image, view_hierarchy)
    text_color = colors[-1]
    # print('text_color', text_color)
    # print('colors', colors)
    return {
        'r': text_color[0] / 255,
        'g': text_color[1] / 255,
        'b': text_color[2] / 255
    }

def get_area_colors(element, image, view_hierarchy, show_image=False):
    screen_width, screen_height = view_hierarchy['activity']['root']['bounds'][2:]
    image_height, image_width = image.shape[:2]
    bounds = copy(element['bounds'])
    left = floor(bounds[0] * image_width/screen_width)
    upper = floor(bounds[1] * image_height/screen_height)
    right = ceil(bounds[2] * image_width/screen_width)
    lower = ceil(bounds[3] * image_height/screen_height)
    bounds = left, upper, right, lower
    # for index, position in enumerate(bounds):
    #     if index % 2:
    #         bounds[index] = floor(position * image_width/screen_width)
    #     else:
    #         bounds[index] = floor(position * image_height/screen_height)
    # print('image_sample', image[:10])
    element_area = image[upper:lower, left:right, :]
    # print('element_area_sample', element_area[:10])
    # image = image.reshape(image.shape[0]*image.shape[1], 3)
    colors, counts = np.unique(element_area.reshape(element_area.shape[0]*element_area.shape[1], 3), axis=0, return_counts=True)
    count_sort_indices = np.argsort(-counts)
    sorted_colors = colors[count_sort_indices]
    # if len(colors) < 5:
    #     print('element_bounds', element['bounds'])
    #     print('bounds', bounds)
    #     print('element_area', element_area)
    #     print('colors', colors)
    #     print('counts', counts)
    #     print('sorted_colors', sorted_colors)
    # print(bounds, element_area, colors, counts, sorted_colors)
    return sorted_colors

    # img = image.copy()
    # element_area = img.crop(bounds)
    # if show_image:
    #     element_area.show()
    #     draw = ImageDraw.Draw(image)
    #     left, upper, right, lower = bounds
    #     coords = [(left, upper), (right, upper), (right, lower), (left, lower), (left, upper)]
    #     draw.line(coords, fill=ImageColor.getrgb('red'), width=4)
    #     image.show()
    # colors = element_area.getcolors(maxcolors=32768)
    # return colors
        

def source_element_on_screen(element, image, view_hierarchy):
    right, lower = element['bounds'][2:]
    screen_width, screen_height = view_hierarchy['activity']['root']['bounds'][2:]
    if not all([position > 0 for position in element['bounds']]): return False
    if not (right < screen_width and lower < screen_height): return False
    if not element['visible-to-user']: return False
    colors = get_area_colors(element, image, view_hierarchy)
    if not len(colors) > 1: return False
    return True
            

def get_links_from_trace(trace):
    links = []
    gestures_path = f'{trace}/gestures.json'
    with open(gestures_path) as json_file:
        gestures = json.load(json_file)
    gesture_items = gestures.items()
    clicks = [gesture for gesture in gesture_items if is_click(gesture)]
    targets = get_targets(trace)
    for index, (screen_id, coordinate_pairs) in enumerate(clicks):
        first_touchpoint = coordinate_pairs[0]
        x_scaled, y_scaled = first_touchpoint
        view_hierarchy_path = f'{trace}/view_hierarchies/{screen_id}.json'
        with open(view_hierarchy_path) as json_file:
            view_hierarchy = json.load(json_file)
        if view_hierarchy is not None:
            text_elements = get_text_elements(view_hierarchy['activity']['root'])
            bounds = view_hierarchy['activity']['root']['bounds']
            x = bounds[2] * x_scaled
            y = bounds[3] * y_scaled
            touchpoint = [x,y]
            ui_image_path = f'{trace}/screenshots/{screen_id}.jpg'
            # pil_image = Image.open(ui_image_path)
            image = cv2.imread(ui_image_path)
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            modified_image = preprocess_image(image)
            for text_element in text_elements:
                if has_additional_screen(index, gestures):
                    target_screen_id = list(gesture_items)[index+1][0]
                    if target_screen_id in targets and source_element_on_screen(text_element, modified_image, view_hierarchy):
                        target = targets[target_screen_id]
                        link = {
                            'source': {
                                'characters': text_element['text'],
                                'parentId': screen_id,
                                'color': get_color(text_element, modified_image, view_hierarchy)
                            },
                            'target': target,
                        }
                        if touchpoint_on_element(touchpoint, text_element):
                            link['isLink'] = True
                        if not text_element['clickable']:
                            link['isLink'] = False
                        links.append(link)
    return links


links = get_links_from_trace(example_interaction_trace_path)
print(len(links))
pprint(links[0])

105
{'source': {'characters': '0',
            'color': {'b': 0.8, 'g': 0.8, 'r': 0.8},
            'parentId': '370'},
 'target': {'id': '488',
            'topics': ['About This App',
                       'Rate the Malaysia Airlines App',
                       'Privacy Policy',
                       'Terms & Conditions',
                       'Legal',
                       'Call Centre',
                       '120',
                       'Contact Us',
                       'Track Baggage',
                       'MHupgrade',
                       'Booking.com',
                       'Schedule',
                       'Deals of the Day',
                       'Book Flight',
                       'Trips / Check-in',
                       'Home',
                       'GRACE CHAN',
                       'Find Flights',
                       'Find Flights',
                       '0',
                       '(<2yrs)',
                       'Infants',
                   

In [240]:
def get_links_from_app(application_name):
    application_folder = f'{RICO_DATA}/{INTERACTION_TRACES}/{application_name}'
    traces = [f'{application_folder}/{child}' for child in os.listdir(application_folder) if 'trace' in child]
    traces_with_more_than_one_screen = [trace for trace in traces if len(os.listdir(trace)) > 1]
    links = []
    for trace in traces_with_more_than_one_screen:
        trace_links = get_links_from_trace(trace)
        links = links + trace_links
    return links

links = get_links_from_app(EXAMPLE_APP)
pprint(links[:3])

[{'source': {'characters': '0',
             'color': {'b': 0.8, 'g': 0.8, 'r': 0.8},
             'parentId': '370'},
  'target': {'id': '488',
             'topics': ['About This App',
                        'Rate the Malaysia Airlines App',
                        'Privacy Policy',
                        'Terms & Conditions',
                        'Legal',
                        'Call Centre',
                        '120',
                        'Contact Us',
                        'Track Baggage',
                        'MHupgrade',
                        'Booking.com',
                        'Schedule',
                        'Deals of the Day',
                        'Book Flight',
                        'Trips / Check-in',
                        'Home',
                        'GRACE CHAN',
                        'Find Flights',
                        'Find Flights',
                        '0',
                        '(<2yrs)',
                        'Infants

In [241]:
def get_links():
    interaction_traces_folder = f'{RICO_DATA}/{INTERACTION_TRACES}'
    links = []
    apps = [f for f in os.listdir(interaction_traces_folder) if not f.startswith('.')]
    for index, app in tqdm(enumerate(apps), total=len(apps)):
        app_links = get_links_from_app(app)
        links = links + app_links
        if index % 500 == 0:
            with open(f'{DATA}/links.json', 'w') as f:
                json.dump({"links": links}, f)
    return links

links = get_links()
pprint(links[:3])

100%|██████████| 9384/9384 [26:19<00:00,  5.94it/s]   

[{'source': {'characters': 'Terms & Conditions.',
             'color': {'b': 0.0784313725490196,
                       'g': 0.2980392156862745,
                       'r': 0.4117647058823529},
             'parentId': '31'},
  'target': {'id': '51', 'topics': []}},
 {'source': {'characters': 'By continuing you accept Privacy Policy,',
             'color': {'b': 0.30196078431372547,
                       'g': 0.5764705882352941,
                       'r': 0.7607843137254902},
             'parentId': '31'},
  'target': {'id': '51', 'topics': []}},
 {'isLink': False,
  'source': {'characters': 'Select Position',
             'color': {'b': 1.0, 'g': 1.0, 'r': 1.0},
             'parentId': '297'},
  'target': {'id': '297',
             'topics': ['Next',
                        'Skip',
                        'Back',
                        'Select Position',
                        'Notification Position',
                        'Show notification on outgoing calls',
             




In [242]:
with open(f'{DATA}/links.json', 'w') as f:
    json.dump({"links": links}, f)