Add SIP creation panel #30

Merged
merged 87 commits into from
May 30, 2014

@Hwesta Hwesta commented Apr 17, 2014

Add the SIP creation panel, which allows arranging files from transfers into SIPs.  Partially described by https://www.archivematica.org/wiki/Transfer_and_SIP_creation#Mockups

  • arrange files from Transfers to a new SIP by storing new paths in SIPArrange table.  On create SIP, POST to SS to move from old path to specified path
  • SIP arrangement panel
    • left side shows files from backlogged transfers, populated by the backlog search from ES
    • right side shows arranged SIPs, created by creating directories, dragging transfer files over and rearranging them. Any folder can be selected to create a new SIP
    • logs and metadata cannot be copied over, though they can be viewed
  • delete transfers whose contents have been completely stored

Other Features:

  • run syncdb by dev-helper
    • to create SIPArrange table
  • failed SIP hook
    • sets files in failed SIP available to be arranged again
  • backlog now uses SS (Enabled backlogged transfers archivematica-storage-service#9)
  • added SIP restructure for files with UUIDs after failing verification, for newly created arranged SIPs that don't have the expected structure
  • separation of download from SS vs download from local FS (ie SIP Arrangement)
  • cleanup
    • move restructureForCompliance functions to the client scripts that actually call them
    • required directories and optional files for SIP verification and restructuring moved to common
    • SS url generation centralized to storageService.py helper module
    • remove now unneeded backlog search page - backlog search populates the SIP arrangement panel

Requires artefactual/archivematica-storage-service#9

Still needs more history cleanup.

@mistydemeo

You already know this, so just marking down so I remember: the failed SIP hook should probably take https://projects.artefactual.com/issues/6636 into account. I imagine weird things would happen if a rejected DIP ends up being slurped back as a SIP for arrangement?

@mistydemeo

backlog now uses SS (artefactual/archivematica-storage-service#9)

Yay! More and more stuff that should be using the SS is using it :D

import sys

path = '/usr/share/archivematica/dashboard'
if path not in sys.path:

I've seen this pattern a few times: just out of curiosity, do we anticipate these client scripts being imported by something that is already using dashboard paths?

@Hwesta

I noticed that the sys.path is often cluttered with multiple instances of '/usr/share/archivematica/dashboard', '/usr/lib/archivematica/archivematicaCommon' etc having been added, and was hoping to prevent that (what happens if sys.path gets absurdly large?). It seems less useful in client scripts as they're probably run once and don't have this problem. I go back and forth on whether it's a worthwhile pattern to cart around.
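The guard discussed here can be wrapped in a small helper. This is a minimal sketch, not code from the PR: `add_to_sys_path` and its `search_path` parameter are hypothetical names introduced for illustration, and the demo uses a private list so the real `sys.path` is not modified.

```python
import sys


def add_to_sys_path(path, search_path=None):
    """Append path to the search path only if it is not already present.

    Returns True if the path was added, False if it was already there.
    add_to_sys_path is a hypothetical helper, not part of Archivematica.
    """
    if search_path is None:
        search_path = sys.path
    if path in search_path:
        return False
    search_path.append(path)
    return True


# Demonstrate on a private list so the real sys.path is left untouched.
demo_path = []
first = add_to_sys_path('/usr/share/archivematica/dashboard', demo_path)
second = add_to_sys_path('/usr/share/archivematica/dashboard', demo_path)
```

Calling the helper twice with the same directory leaves a single entry, which is the clutter the guard is meant to prevent.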

@mistydemeo

This looks good! I admit I skimmed the JS, but otherwise looks like it's in good shape

@mistydemeo

Can the same file(s) appear in multiple SIPs?

@Hwesta

Hwesta commented Apr 30, 2014

Updated failedSIPCleanup to run before the (likely to fail) move to failed directory. Since the post-failed SIP hook only touches the SIPArrange table, DIPs should be unaffected.

The same file should not appear in multiple SIPs. Any entry in SIPArrange with the same original_path greys out the corresponding entry in the originals panel. Once an AIP is created, store_aip checks to see if all files in a backlogged transfer are in AIPs (as tracked by the SIPArrange table), and submits a delete request if that's the case. It's possible to get around these restrictions, but this should prevent casual errors.
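The two checks described above can be sketched with plain sets. This is only an illustration under stated assumptions: `is_arranged` and `transfer_fully_stored` are hypothetical names, and the real code works against the SIPArrange table and the storage service rather than in-memory collections.

```python
def is_arranged(original_path, arranged_original_paths):
    """True if the file already has an arrangement entry (would be greyed out)."""
    return original_path in arranged_original_paths


def transfer_fully_stored(transfer_files, files_in_aips):
    """True if every file of a backlogged transfer is recorded as being in an AIP.

    An empty transfer is treated as not stored, so nothing is deleted by accident.
    """
    return bool(transfer_files) and set(transfer_files) <= set(files_in_aips)


# Hypothetical data: one transfer with two files, one of them arranged and stored.
arranged = {'t1/objects/a.txt'}
transfer_files = ['t1/objects/a.txt', 't1/objects/b.txt']

partly_stored = transfer_fully_stored(transfer_files, {'t1/objects/a.txt'})
fully_stored = transfer_fully_stored(transfer_files, set(transfer_files))
```

Only when the second check returns True would a delete request for the backlogged transfer make sense.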

@Hwesta

Hwesta commented Apr 30, 2014

Rebased on current qa/1.x, with updates mentioned above.

Add directory open/close snapshotting.  Tweaked drag & drop positioning so
it won't mess with hover.  Fixed an issue where elements with spaces in the
CSS ID wouldn't be able to drag and drop. Got rid of spaces in CSS IDs.

Change file browser so name click handle can be overridden. Change backlog
viewer so you click to highlight a file then click a button to view the
file, rather than clicking on its name. Change so files can be moved within
arrange pane.

Added logic to rewrite processing config XML to remove pre-selection of
normalization identification method when creating SIPs from the arrange
directory. This is because one method references data which won't exist
because the files haven't actually been run through the transfer phase
in their current incarnation.

Added a pop-up alert if the user is trying to create
a SIP without first selecting a directory.

Added UI logic so you can't delete a directory that isn't in
the root level of the arrange directory.

Cleaned up SIP arrange JS, fixed minor issue, and moved JS out of
HTML template.

Added logic to prevent users from abusing copy to transfer endpoint.

Fixed minor issues, including issue with SIP create from backlog.

Fixed an issue with copying single files into a potential SIP directory
in the SIP arrangement panel.
Hwesta added 27 commits May 29, 2014 16:43
refs #6131

Update create directory function to expect SIPArrange table.
Directories in SIPArrange now have NULL for the source path and
file_uuid, as there may be no original source, and we only care about
moving the actual files around.  Updated unit test and fixture to match.
refs #6131

Update delete endpoint to check if a file that doesn't exist is part of
SIP arrangement and delete it from the SIPArrange table.
Remove copy_to_originals, helper function, and unused transfer browser.
Transfer browser was an early attempt at SIP arrangement, superseded by
existing SIP arrangement.
refs #6131

Remove file UUIDs for directories, and added file UUID to SIPArrange
creation.  Remove passing shared directory information to JS, as that
will be handled Django-side.  Remove originals panel search, and moved
transfer backlog search to the top of the page.
refs #6131

Create SIP button in SIP Arrangement panel moves arranged files from backlog
to the processing space.  Database information not set up yet, as that needs
to be fetched from ElasticSearch or the Storage Service.

SIPArrange entries that are folders no longer have a source path, as this
leads to confusion when moving files.  arrange_paths are now unique, since
an arranged file shouldn't have multiple sources.

Add new create SIP function for arranged SIPs, and remove the old create
SIP function, helper functions, and endpoint.  Create arranged SIP
function creates the SIP object, adds the arranged files to that SIP and
updates the locations.  WARNING this only works with files that
originated on this pipeline.  Also, clean up headers.
refs #6131

All files in a Transfer will be indexed so that the metadata and logs will
be available to be viewed in the SIP Arrangement panel.  Indexing and
updating the index on a file do not verify that they are in the DB anymore.
The full file path is not indexed, but the relative path (including the
transfer folder) is.

Add a wrapper for search_raw that queries for MAX_QUERY_SIZE entries from
ElasticSearch.  The default number returned is 10, and there is no way to
request all entries that match a search.
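A wrapper of this kind might look like the sketch below. Everything here is an assumption made for illustration: the value of `MAX_QUERY_SIZE` is not given in the commit message, `FakeConnection` is a stand-in rather than the real search client, and its `search_raw` signature is invented for the demo.

```python
MAX_QUERY_SIZE = 50000  # assumed value; the commit message does not state it


class FakeConnection:
    """Stand-in for a search client; NOT the real client API."""

    def __init__(self, documents):
        self.documents = documents

    def search_raw(self, query, size=10):
        # Mimic the default described above: 10 hits unless a size is given.
        return self.documents[:size]


def search_raw_wrapper(conn, query, **kwargs):
    """Call conn.search_raw, asking for MAX_QUERY_SIZE hits unless told otherwise."""
    kwargs.setdefault('size', MAX_QUERY_SIZE)
    return conn.search_raw(query, **kwargs)


conn = FakeConnection(list(range(25)))
default_hits = conn.search_raw({'match_all': {}})
wrapped_hits = search_raw_wrapper(conn, {'match_all': {}})
```

Using `setdefault` keeps an explicit `size` from the caller intact, so the wrapper only changes the default.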
refs #6131

Several updates to make the transfer backlog search populate the originals
panel in SIP Arrangement.

Backlog.js queries the dashboard, which queries ElasticSearch based on the
provided query parameters.  The resulting paths are rearranged into a format
that the Javascript file-explorer is expecting and returned.  Backlog.js
updates the originals panel and forces a re-render.
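The rearranging step could resemble the following sketch, which turns flat relative paths into a nested tree. The node layout (`name` and `children` keys) and the function name `build_tree` are assumptions for illustration; the format the file-explorer actually expects is not described in the commit message.

```python
def build_tree(relative_paths):
    """Convert flat relative paths into a nested list of {'name', 'children'} nodes."""
    root = {'name': '', 'children': []}
    for path in sorted(relative_paths):
        node = root
        for part in path.strip('/').split('/'):
            # Reuse an existing child with this name, or create it.
            for child in node['children']:
                if child['name'] == part:
                    node = child
                    break
            else:
                child = {'name': part, 'children': []}
                node['children'].append(child)
                node = child
    return root['children']


# Hypothetical search results for a single backlogged transfer.
tree = build_tree([
    't1/objects/a.txt',
    't1/objects/b.txt',
    't1/logs/x.log',
])
```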

Cleanup of unneeded functions: originals content endpoint, create/delete JS
from backlog.js, backlog viewing template, remainder of originals pane
search code, misc other helper functions.
refs #6131

Originals hide button now hides the selected folder and all its children.
Arrange delete button given its own endpoint, so we don't have to guess
when a delete is for arrange instead of the filesystem.  file_browser
delete URL made configurable to support this.
refs #6131

If the source folder is the objects directory, add the contents of the
objects directory, not the directory itself.  Rearranged copy_to_arrange to
accommodate change and error checking.

Create sip has to be top level directory (cannot be a subfolder or the
arrange directory).
refs #6131

Delete doesn't delete directories that start with the same characters.
Create directory can create top level directories if nothing is selected.
Originals panel has some default text.  Don't display the 'files?' option
in backlog search.
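The first fix in this commit concerns a common prefix-matching pitfall. The sketch below shows one way such a bug can arise and be avoided; it is an assumption about the cause, not the actual code, and `is_inside` is a hypothetical helper.

```python
def is_inside(path, directory):
    """True if path is the directory itself or lies underneath it."""
    directory = directory.rstrip('/')
    return path == directory or path.startswith(directory + '/')


paths = [
    '/arrange/sip',
    '/arrange/sip/a.txt',
    '/arrange/sip2',
    '/arrange/sip2/b.txt',
]

# A bare prefix test also matches the sibling directory 'sip2'.
naive = [p for p in paths if p.startswith('/arrange/sip')]

# Requiring a path separator after the directory name excludes it.
safe = [p for p in paths if is_inside(p, '/arrange/sip')]
```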
refs #6131

Add attribute to transfer backlog results that specifies logs and metadata
directories (and their children) as not draggable.  Update the file browser
to recognize that attribute, and disable the drag & drop handler for those
elements.  Style those elements grey.
refs #6131

Fetch file UUID from ElasticSearch for arranged files, based on the original
relative path. Strip the UUID from a transfer name when creating a SIP from
a Transfer.
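Stripping the UUID could be done with a regular expression, as in this sketch. It assumes the transfer name ends in a hyphen followed by a canonical 36-character UUID; that naming convention and the helper name `strip_uuid` are assumptions, not taken from the commit.

```python
import re

# Assumed format: "<name>-<UUID>" with the UUID in canonical hyphenated form.
UUID_SUFFIX = re.compile(
    r'-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$',
    re.IGNORECASE,
)


def strip_uuid(name):
    """Remove a trailing '-<UUID>' from name, if present."""
    return UUID_SUFFIX.sub('', name)


sip_name = strip_uuid('mytransfer-3f2504e0-4f89-11d3-9a0c-0305e82c3301')
unchanged = strip_uuid('plain-name')
```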
refs #6131

Enforcing unique arranged paths causes problems when a SIP that has been
created had the same path.  Remove requirement for unique arranged paths
for now.
refs #6131

Logs and metadata folders added to SIP when created, or empty folders
created if they don't exist.
refs #6131

A file should not be in multiple SIPs.  If a file has been arranged (can be
found in SIPArrange.original_files), then it is greyed out in the originals
panel. When a file is dragged to the arranged panel, or deleted from the
arranged panel, the originals panel refreshes to get the updated status.

SIPArrange original paths are now unique, and copying a folder does
not copy files from originals that have already been arranged.
refs #6067, #6131

All paths sent via JSON are encoded in base64, to handle cases where
they contain non-unicode characters.  Files in SIP arrangement should
all be in unicode, but because it shares code with transfer source, the
code should be able to handle non-unicode characters, and be internally
consistent.

filesystem_ajax and ingest views updated to expect base64 paths and
return base64 paths as needed.  file-explorer and file-browser updated
to expect and return base64 paths as well.  browse_location encodes
outgoing paths to base64, and decodes the returned entries.
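The round trip described above can be sketched as follows. This is written for Python 3, where paths that may not be valid UTF-8 are handled as `bytes`; `encode_path` and `decode_path` are hypothetical helper names, not functions from the PR.

```python
import base64
import json


def encode_path(path):
    """Encode a bytes path as an ASCII base64 string that is safe to put in JSON."""
    return base64.b64encode(path).decode('ascii')


def decode_path(encoded):
    """Decode a base64 string back into the original bytes path."""
    return base64.b64decode(encoded)


# This path contains a Latin-1 byte, so it is not valid UTF-8.
raw_path = b'/transfers/caf\xe9.txt'

payload = json.dumps({'path': encode_path(raw_path)})
round_tripped = decode_path(json.loads(payload)['path'])
```

Because the JSON body only ever carries ASCII, the same code path works for transfer source browsing and SIP arrangement alike.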

Also, other fixes from rebase.
restructureForCompliance.py contained several functions that were not
used in that script, but were imported into other client scripts.  Move
those functions to the correct client script.  Add common code to
archivematicaFunctions and import that.
refs #6131

Add microservice to restructure a SIP for compliance, where the files
are already stored in the DB.  Insert microservice after verify SIP for
compliance, if it fails.
refs #6131

Add moveToBacklog client script that uses the storage service to move the
Transfer to the configured backlog Location.  Update storeAIP to send
relative AIP path instead of absolute.  Update output when moving a file.
refs #6131

Update arranged status of files to track when a SIP fails or completes
successfully.

Index AIP marks files as being in an AIP, and submits a
deletion request to the storage service when all files in a transfer
are in an AIP.  Transfer UUID is tracked for each arranged file to
enable this.

Failed and rejected SIPs run failedSIPCleanup.py, which deletes
SIPArrange rows for each file in the SIP, making them arrangeable again.
refs #6131

Add two download files to filesystem_ajax: one that proxies to the
storage service, another which reads from local disk. Update view file
to use the storage service, update browse AIP to use the local FS.
Update archival storage to use storage service helpers, so the base
storage service URL is only fetched in one place.
refs #6131

ElasticSearch 0.90/Lucene 4 now interprets / as a regex delimiter, so
slashes in queries must be escaped.
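A minimal sketch of the escaping, assuming the query is a plain string that has not been escaped yet; `escape_slashes` is a hypothetical name and the function does not try to detect slashes that already carry a backslash.

```python
def escape_slashes(query):
    """Prefix every forward slash with a backslash so it is not read as a regex delimiter."""
    return query.replace('/', '\\/')


escaped = escape_slashes('transfer-1/objects/file.txt')
```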
refs #6131

SIPs can be created from a folder at any level.  Change how the new SIP
name is parsed out, and add extra error checking to ensure that a SIP
cannot be created from a file.
refs #6131

Later microservices copy metadata and logs from the completedTransfers
directory, so copy the expected files there.
refs #6131

Update tests for code changes, including base64 and a default arrange
directory.  Add tests for new functions.  Add new, more valid fixture
data.
@qubot qubot merged commit 6f61344 into qa/1.x May 30, 2014
@qubot qubot deleted the dev/issue-6022-siparrangement branch May 30, 2014 00:07
helenst added a commit to helenst/archivematica that referenced this pull request Sep 14, 2017
Defaults to 300 seconds which should give storage service time to process larger files before the API call times out.