4. Excel2SBOL Module and Repository Architecture

JMante1 edited this page Jun 11, 2021 · 5 revisions

Repository Architecture

This repository contains the excel2sbol module, resources to use it (such as templates), and the tests for all of the functions it contains.

Repository Automation and Secrets

Repository Secrets

  • PYPI_USERNAME
  • PYPI_PASSWORD

GitHub Actions

  • linting: runs on pull request to test flake8 compliance of the code
  • testing: runs on pull request to run the suite of pytest tests
  • python-publish: runs on release creation to push a new package to PyPI

File Structure

  • Home
    • excel2sbol: main project folder
      • utils: the main project code
        • *.py: main project code
        • template_constants.txt: contains the constants relating to each spreadsheet template
      • tests: pytest test suite
        • test_files: data files used for testing
        • test_*.py: files containing pytest code
      • resources:
        • templates: Excel templates for use with the excel2sbol converter
        • taxonomy_scrapers: files used to create the Excel ontologies in the converter templates
      • README: required for packaging of the excel2sbol library
      • setup.py: Used for pip installation of the package
    • images: contains images for the README
    • .github: contains issue templates and GitHub Actions
      • workflows: GitHub Actions
        • linting: runs on pull request to test flake8 compliance of the code
        • python-publish: runs on release creation to push a new package to PyPI
        • testing: runs on pull request to run the suite of pytest tests
      • ISSUE_TEMPLATE: bug_report, documentation_issue, and feature_request
    • requirements.txt: Python dependencies
    • README.md: the quick guide displayed on GitHub
    • LICENSE: BSD-3-Clause
    • .gitignore: lists the files that Git should not track

Module Architecture

Excel-to-SBOL works by splitting the Library spreadsheet into three parts:

  1. Overview information (e.g. Collection Name, Date Created, and Authors)
  2. Design Description: The overview of the design collection
  3. Part table: The table of parts provided
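
The three-way split above can be pictured with a minimal sketch. The row positions in `LAYOUT`, the `split_sheet` name, and the sample rows are all hypothetical and chosen only for illustration; the real template layout is defined by the templates and template_constants.txt, and the module's own read_in_sheet may work quite differently.

```python
from typing import Dict, List

Row = List[str]

# Hypothetical layout constants (NOT taken from the module or its templates).
LAYOUT = {
    "overview_rows": (0, 3),   # rows holding Collection Name, Date Created, Authors
    "description_row": 4,      # row holding the design description
    "parts_header_row": 6,     # header row of the part table
}


def split_sheet(rows: List[Row], layout: Dict = LAYOUT):
    """Split raw sheet rows into overview, description, and part table."""
    start, stop = layout["overview_rows"]
    # Overview: two-column key/value rows
    overview = {r[0]: r[1] for r in rows[start:stop]}
    # Design description: a single labelled cell
    description = rows[layout["description_row"]][1]
    # Part table: one dict per part, keyed by column header
    header = rows[layout["parts_header_row"]]
    body = rows[layout["parts_header_row"] + 1:]
    parts = [dict(zip(header, r)) for r in body]
    return overview, description, parts


# Invented sample data standing in for a Library spreadsheet
rows = [
    ["Collection Name", "Demo Library"],
    ["Date Created", "2021-06-11"],
    ["Authors", "A. Author"],
    [],
    ["Design Description", "A small demo collection"],
    [],
    ["Part Name", "Role", "Sequence"],
    ["pTest", "Promoter", "ttgaca"],
    ["rTest", "RBS", "aggagg"],
]
overview, description, parts = split_sheet(rows)
```

Keeping the positions in one dictionary mirrors the idea of storing per-template constants separately from the code, so a new template would only need new constants.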

Excel Sheet Parts

Each of the three spreadsheet parts is processed individually.

The part table is the most complex part to process, as it requires both the column_definitions sheet and the other ontology sheets.
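
One way to picture how the column_definitions sheet can drive part-table processing is name-based dispatch: the term listed for a column selects the method that transforms its values. The sketch below is an assumption-laden illustration, not the module's code. `SbolMethodsSketch`, its `switch` and `default` methods, the `sbh_notes` term, and the placeholder transformation are all invented; only the name sbh_sourceOrganism comes from this page.

```python
class SbolMethodsSketch:
    """Illustrative stand-in for a switch-style class; method bodies are invented."""

    def sbh_sourceOrganism(self, value):
        # Placeholder transformation, not the real method's behaviour
        return f"organism:{value.strip()}"

    def default(self, value):
        # Fallback used when no method matches the column's term
        return value

    def switch(self, sbol_term, value):
        # getattr looks the method up by name, emulating a switch statement
        handler = getattr(self, sbol_term, self.default)
        return handler(value)


# Hypothetical extract of a column_definitions sheet: column name -> term
column_definitions = {
    "Source Organism": "sbh_sourceOrganism",
    "Notes": "sbh_notes",  # no matching method, so the fallback is used
}
row = {"Source Organism": " Escherichia coli ", "Notes": "demo"}

methods = SbolMethodsSketch()
processed = {
    col: methods.switch(column_definitions[col], val)
    for col, val in row.items()
}
```

With this pattern, supporting a new column type means adding one method whose name matches the term in the sheet; no central if/elif chain has to change.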

The architecture is:

  • converter_function.py
    • Function: converter, relies on the functions in the files listed below to read a spreadsheet, process each of its parts individually, and output an SBOL file.
    • Dependency: helper_functions.py, column_functions.py, initialise_functions.py
  • initialise_functions.py
    • Function: read_in_sheet, reads in the Excel spreadsheet and splits it into the three parts described above for further processing.
    • Class: table, takes the dictionary produced by read_in_sheet and calls the column class to create a column object for every column.
    • Dependency: column_functions.py
  • column_functions.py
    • Class: sbol_methods, a class that implements a switch statement to process each of the Excel columns. For example, if sbh_sourceOrganism is present in the column_definitions sheet, then sbol_methods.sbh_sourceOrganism() is called automatically and used to transform the data as needed.
    • Class: column, creates a column object to make the data associated with a column easier to handle. This includes creating a lookup dictionary if one is specified in the dictionary the class takes as input.
    • Dependency: helper_functions.py
  • helper_functions.py
    • Function: col_to_num, converts Excel column names such as AA to zero-indexed numbers such as 26
    • Function: check_name, ensures that a string is alphanumeric and contains no special characters (including spaces) apart from '_'
    • Function: truthy_strings, converts several different kinds of input (e.g. 'True', True, 1, '1', 'tRue') to the boolean True or False
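
The three helper functions are small enough to sketch from their descriptions. The versions below are reconstructions under stated assumptions, not the module's source: here check_name is assumed to replace disallowed characters with '_' (it might instead validate or raise), and truthy_strings is assumed to return False for any input not recognised as true.

```python
import re


def col_to_num(col_name: str) -> int:
    """Convert an Excel column name to a zero-indexed number ('A' -> 0, 'AA' -> 26)."""
    num = 0
    for ch in col_name.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column name: {col_name!r}")
        # Base-26 accumulation where A=1 ... Z=26
        num = num * 26 + (ord(ch) - ord("A") + 1)
    return num - 1  # shift from one-indexed to zero-indexed


def check_name(name: str) -> str:
    """Assumed behaviour: replace anything that is not a letter, digit, or '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def truthy_strings(value) -> bool:
    """Map 'True', 'tRue', '1', True, and 1 to True; assume everything else is False."""
    if isinstance(value, str):
        return value.strip().lower() in {"true", "1"}
    return value is True or value == 1


column_index = col_to_num("AA")
clean_name = check_name("my part-1")
flags = [truthy_strings(v) for v in ("True", True, 1, "1", "tRue")]
```

The base-26 loop treats column names as a bijective numbering (there is no zero digit), which is why 1 is added per letter and subtracted once at the end.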

Dependency structure

This graphic shows the dependency structure of the different functions used in the module. The arrows indicate dependency and the colours indicate the file they can be found in (see the key in the bottom right-hand corner).

[Figure: Dependency Structure]