
CERN'15 Hackathon (done)

Dennis Blommesteijn edited this page Sep 18, 2015 · 1 revision

8-9 July 2015

Location & Room

CERN, Geneva

Room 600-R-002 (Building 600) on the 8th of July.

Room 513-R-068 (Building 513) on the 9th of July morning.

Room 600-R-001 (Building 600) on the 9th of July afternoon.

Agenda

Wednesday 8 July 12:00 - 18:00: Discussion of the new architecture for B2SHARE 2.0

Thursday 9 July 9:00 - 16:30: Start developing B2SHARE 2.0

Topics

  • B2SHARE 2.0

    • layered architecture
    • backend API sketch
    • UI remake (tech, design)
    • domain metadata schema
  • Rediscuss the git development workflow, write down the correct workflow and make sure we all understand and respect it

Participants

  • Nicolas
  • Carl Johan
  • Sarah
  • Lassi
  • Dennis
  • Emanuel

Decisions taken for B2Share 2.0

We will switch to Invenio 3.0

Why:

  1. Now is the right time to do it. Invenio 3 will be a big refactoring of Invenio 2. It will be a lot cleaner, without the legacy modules and with clearly defined APIs. As we are refactoring a lot of things for B2Share 2.0 anyway, it is better to switch now.
  2. It will have a new JSON schema. We need this new schema, which is much better than the MARC schema.

How:

We will switch to the master branch of Invenio this summer, separating Invenio and B2Share at the same time. Invenio 3 should be released by the end of August.

Notes:

  • We noted that we want to be able to modify the JSON schemas easily in the database, not only in files. The JSON schemas also need to exist as files because of the way they work (inter-file references). Maybe we can generate the files from the database. FUTURE WORK.
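As a rough illustration of the "generate the files from the database" idea, here is a minimal sketch. The dictionary standing in for the database, the schema contents, and the function name `export_schemas` are all hypothetical; nothing here reflects an actual Invenio or B2Share API.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for a database table mapping a file name to a schema document.
# The storage layout and the schema contents are assumptions for illustration.
SCHEMAS_IN_DB = {
    "record.json": {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            # Inter-file reference: this only resolves if person.json
            # exists as a file next to record.json.
            "creator": {"$ref": "person.json"},
        },
        "required": ["title"],
    },
    "person.json": {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "type": "object",
        "properties": {"name": {"type": "string"}},
    },
}


def export_schemas(schemas, target_dir):
    """Write every schema to target_dir and return the written paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name, schema in sorted(schemas.items()):
        path = target / name
        path.write_text(json.dumps(schema, indent=2, sort_keys=True))
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in export_schemas(SCHEMAS_IN_DB, tmp):
            print(path.name)
```

With this approach the database would stay the editable source of truth, and the files would be regenerated whenever a schema changes.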

Docker and PyPI for release and deployment of B2Share 2.0

  1. We will use Docker for release and production.
  2. Nicolas will also release on PyPI for bare-metal deployment.

Why:

Because Docker simplifies things a lot, and PyPI lets us version B2Share as an official Python package.

We will use the AngularJS UI for the front end

Why:

  1. AngularJS has proven to be a good framework. We already have a prototype which works quite well.
  2. It will enable us to make a great REST API.

How:

The AngularJS UI will use the REST API which we will implement in our overlay of Invenio.

We will use the Invenio Submit module instead of our own

Why:

  1. The Invenio Submit module has a lot of features we want for B2Share version 2.0 or later. Examples: the generation of thumbnails for pictures, the generation of multiple resolutions of submitted video files, curation/validation workflows...
  2. We don't have enough workforce to build all these features ourselves. We would have to choose which ones to implement.

How:

  1. The current B2Deposit module contains a lot of things which could be in different modules. The first step is to split b2deposit.
  2. Then we remove b2deposit and refactor what needs to be refactored in order to use the Invenio deposit module.

We will have multiple repositories on Github and use Invenio as a dependency.

Why:

  • Right now the Invenio source code is copied into B2Share. This has proven to be impractical for integrating new versions of Invenio. We will have a fork of Invenio, as it is hard to contribute without one.

We will also have a repository for the new UI.

How:

Nicolas will:

  1. create a fork of Invenio.
  2. create a branch "evolution" in B2Share which will contain the future 2.0 version.
  3. add the modifications from B2Share to Invenio, just to make B2Share work at the beginning.
  4. remove Invenio code from B2Share.

Dennis will:

  • Create a repository for the new AngularJS UI.

We will use temporary branching while we work on 1.6 and 2.0

Why

1.6 and 2.0 will not be compatible code-wise, so we need separate branches for them.

How

  1. Nicolas merges the Invenio 2.0 maintenance branch into B2Share master ASAP.
  2. Nicolas creates the evolution branch from master ASAP.
  3. On the 10th of August, the master branch is renamed to 1.6 and the evolution branch is then renamed to master. We also have to change the "default" branch to the new master.

Note

This is a one-time solution only. We decided that in the future we will still branch new releases from the master branch. This branching will be done AT CODE FREEZE, as suggested by Dennis.

We will propose a REST API ASAP to the REST API committee

Why

  • There will be an external service providing an HTTP REST API for all EUDAT services. The REST API committee is starting to work on it now and we are still free to make suggestions.
  • Also, we are the only ones with a REST API, and the committee itself has suggested that they might use our API.

How

  1. Define the features we need to have access to through the REST API.
  2. Define the endpoints. There is an agreement that merging /api/depositions and /api/records into one endpoint would make things simpler. Also, general terms should be plural, e.g. records, not record. In the current version we mix record/records and deposit/deposits.
  3. Define parameters.

We agreed on a record/deposit state-machine

Why

There were multiple open issues. We were not sure:

  • when we wanted the PID to be created and assigned.
  • whether embargo needed to have its own state.
  • whether we wanted access control at all levels.
  • when the record/deposition was editable and when not. Did it depend only on public access?

The final diagram

      Start
     +-+++-----------------------------------------------------------+
     | +++                                                           |
     | +++  create   +-------+       delete     +---------+          |
     | +++---------> | Draft +----------------->| Deleted |          |
     |               +-------+ <-+              +---------+          |
     |                ^ |  |     | save             ^   ^            |
     |                | |  +-----+                  |   |            |
     |                | |                           |   |            |
     |         reject | | submit                    |   |            |
     |                | v                           |   |            |
     |                +-----------+         delete  |   |            |
     |                | Submitted | ----------------+   |            |
     |                +-----------+                     |            |
     |                  |                               |            |
     |                  | accept                        |            |
     |                  |                               |            |
     |                  v                               |            |
     |            +----------+              delete      |            |
     |            | Released | -------------------------+            |
     |            +----------+                                       |
     |                 |                                             |
     |                 v                                             |
     +---------------++++--------------------------------------------+
                     ++++                                             
                     End                                              

The naming is not decided yet. This diagram is only here to show the different steps we agreed on.

  • Draft:

    • No PID.
    • Editable without creating new versions (we might add some "history" of modifications later on to make undo easy).
    • Private to the user (we might one day enable multiple people to work on it, but this is not a current need).
  • Submitted:

    • Here we use the Invenio Submit module workflows. They would enable a community administrator to specify a list of actions that have to be performed before the record is accepted, for example spam filtering or curator editing.
    • No PID.
    • Editable without creating new versions.
    • Visible only to people involved in the workflows.
  • Released:

    • Has a PID.
    • Metadata can be modified by submitting a JSON patch, which has to go through all previous steps (Draft, Submitted) and thus be validated by an admin/curator if necessary. The current record stays accessible in the meantime. Once the JSON patch is accepted, a new version of the record is added to the existing record, which keeps its PID. We would only allow metadata modification, even though Invenio also allows file versioning.
    • Complex access control policies are possible.
  • Deleted:

    • This is mostly an implementation detail. We don't delete the record/deposition but only mark it as deleted.
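The transitions in the diagram can be written down as a small lookup table. The following is only a sketch: the state and action names are copied from the diagram (where naming is explicitly not final), while `State`, `TRANSITIONS`, `create` and `apply_action` are hypothetical names chosen for illustration.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    RELEASED = "released"
    DELETED = "deleted"


# (current state, action) -> next state, read off the diagram.
TRANSITIONS = {
    (State.DRAFT, "save"): State.DRAFT,
    (State.DRAFT, "submit"): State.SUBMITTED,
    (State.DRAFT, "delete"): State.DELETED,
    (State.SUBMITTED, "reject"): State.DRAFT,
    (State.SUBMITTED, "accept"): State.RELEASED,
    (State.SUBMITTED, "delete"): State.DELETED,
    (State.RELEASED, "delete"): State.DELETED,
}


def create():
    """The 'create' action always starts a record in the Draft state."""
    return State.DRAFT


def apply_action(state, action):
    """Return the next state, or raise if the diagram has no such arrow."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            "action %r is not allowed in state %s" % (action, state.name)
        )


if __name__ == "__main__":
    state = create()
    for action in ("save", "submit", "accept"):
        state = apply_action(state, action)
        print(action, "->", state.name)
```

Keeping the arrows in a single table makes it easy to check the code against the diagram, and any action that is not listed is rejected.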

When a user wants to create a new version of a record with different files, they have to create a new record. Thus we might have two buttons on a released record: "edit metadata" and "submit new record from this record".
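To make the "edit metadata" path for released records more concrete, here is a minimal sketch of how an accepted JSON patch could add a metadata version while the PID and files stay untouched. It assumes the patch follows the JSON Patch operation format (`op`, `path`, `value`) and supports only top-level keys, without JSON Pointer escaping. The record layout, the PID value and the function names are hypothetical.

```python
import copy


def apply_patch(metadata, patch):
    """Apply 'add', 'replace' and 'remove' operations on top-level keys only."""
    result = copy.deepcopy(metadata)
    for op in patch:
        path = op["path"]
        if not path.startswith("/") or "/" in path[1:]:
            raise ValueError("only top-level paths are supported: %r" % path)
        key = path[1:]
        if op["op"] == "add":
            result[key] = op["value"]
        elif op["op"] == "replace":
            if key not in result:
                raise KeyError(key)
            result[key] = op["value"]
        elif op["op"] == "remove":
            del result[key]
        else:
            raise ValueError("unsupported operation: %r" % op["op"])
    return result


def release_new_version(record, patch):
    """Append a patched metadata version; the PID and files stay unchanged."""
    latest = record["versions"][-1]
    record["versions"].append(apply_patch(latest, patch))
    return record


# Hypothetical released record with a single metadata version.
record = {
    "pid": "hypothetical-pid-0001",
    "files": ["data.csv"],
    "versions": [{"title": "Old title", "creator": "A. Author"}],
}
patch = [
    {"op": "replace", "path": "/title", "value": "New title"},
    {"op": "add", "path": "/keywords", "value": ["b2share"]},
]
release_new_version(record, patch)
```

In the agreed flow the patch would first pass through the Draft and Submitted steps; this sketch only covers what happens once it is accepted.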

Records in embargo mode are in the Released state, with an ACL that grants new access permissions after some time.
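One possible reading of "embargo as an ACL on a Released record" is an access check that depends on the current date. This is a sketch under assumptions: the field names `owner`, `public` and `embargo_until`, and the rule that the owner can always read, are hypothetical and were not decided at the hackathon.

```python
from datetime import date


def can_read(record, user, today):
    """Return True if `user` may read `record` on the given day."""
    if user == record["owner"]:
        return True
    embargo_until = record.get("embargo_until")
    if embargo_until is not None and today < embargo_until:
        # Still under embargo: only the owner has access.
        return False
    return record["public"]


# Hypothetical released record whose embargo ends on 1 January 2016.
record = {
    "state": "released",
    "owner": "alice",
    "public": True,
    "embargo_until": date(2016, 1, 1),
}
```

The record never changes state when the embargo ends; only the outcome of the access check changes.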

Notes

We don't plan to have workflows with curators for B2Share 2.0. This will be added later on. For 2.0 the records will be automatically accepted when they are submitted.

Unify terms deposition and record as record only

Why

Having only one name will greatly simplify the REST API and the UI. We will be able to search for all records and filter on their state.

Dennis explained that librarians understand the term "record". Thus we agreed that it could be used as the official name.
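A single plural endpoint with a state filter could look roughly like the sketch below. It uses only the Python standard library, not a web framework; the `state` query parameter, the sample data and the `handle_get` function are assumptions for illustration, since the endpoints and parameters were still to be defined.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical sample data: drafts and released records share one collection.
RECORDS = [
    {"id": 1, "state": "draft", "title": "First"},
    {"id": 2, "state": "released", "title": "Second"},
    {"id": 3, "state": "released", "title": "Third"},
]


def handle_get(url, records=RECORDS):
    """Resolve a GET request against the single plural /api/records endpoint."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if segments[:2] != ["api", "records"] or len(segments) > 3:
        return 404, None
    if len(segments) == 3:
        # /api/records/<id>: return one record regardless of its state.
        matches = [r for r in records if str(r["id"]) == segments[2]]
        return (200, matches[0]) if matches else (404, None)
    # /api/records or /api/records?state=<state>
    state = parse_qs(parts.query).get("state", [None])[0]
    if state is None:
        return 200, list(records)
    return 200, [r for r in records if r["state"] == state]


if __name__ == "__main__":
    print(handle_get("/api/records?state=released"))
```

Because drafts and released records live under the same path, a client needs only one URL scheme and one filter parameter instead of separate deposition and record endpoints.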

What we are unsure of right now

We might have private endpoints for the AngularJS UI in addition to the REST API endpoints.

Next steps

Assigned tasks

  1. Lassi

    • iRODS integration
  2. Emanuel

    • ?
  3. Dennis

    • AngularJS UI
  4. Sarah

    • REST API development
  5. Nicolas

    • Merge Invenio maintenance branch
    • Separate Invenio from B2Share