Merge pull request tsugiproject#11 in UELA/tsugi from dev to qa
* commit 'cb75a4b5ebf5aca1e4ab89e24ea98a25cb7957cb': (174 commits)
  Production config.
  Rachet accept all
  Protect the web sockets.
  Use the WebSocket class
  Start using WebSocket utility
  First cut at the socket server tester
  Suggest not to use LTI 2.x
  Round trip koseu Application changes.
  Make some routes depend on session.
  Require a session.
  Actually do the ghost busting - doh.
  Add a note about when ghost busting happens
  Remove duplicate uniqueness clauses
  Fix the spinner as blocker.
  Tuypoe
  Attempt to fix the spinner only frames in chrome.
  Web socket progress
  Round trip the EduAppCenter fixes
  Fix using EduAppCenter
  Add simple room feature.
  ...
davidpbauer committed Aug 2, 2018
2 parents d6f9b6e + cb75a4b commit 0341d0c
Showing 3,009 changed files with 139,999 additions and 25,219 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
4 changes: 2 additions & 2 deletions admin/admin_util.php
@@ -8,7 +8,7 @@
// TODO: deal with headers sent...
function requireLogin() {
global $CFG, $OUTPUT;
- if ( ! isset($_SESSION['user_id']) ) {
+ if ( $CFG->google_glient_id && ! isset($_SESSION['user_id']) ) {
$_SESSION['error'] = 'Login required';
$OUTPUT->doRedirect($CFG->wwwroot.'/login.php') ;
exit();
@@ -21,7 +21,7 @@ function isAdmin() {

function requireAdmin() {
global $CFG, $OUTPUT;
- if ( $_SESSION['admin'] != 'yes' ) {
+ if ( $CFG->google_glient_id && $_SESSION['admin'] != 'yes' ) {
$_SESSION['error'] = 'Login required';
$OUTPUT->doRedirect($CFG->wwwroot.'/login.php') ;
exit();
92 changes: 92 additions & 0 deletions admin/blob/README.md
@@ -0,0 +1,92 @@

All About Blobs
---------------

In Tsugi, there are three places where blobs are described / stored.

* The `blob_file` table, which holds the context, link, file name,
media type, etc. for a file: everything but the blob itself.

* The `blob_blob` table, which holds blob content indexed by `blob_id`
and `blob_sha256`. There is only one blob for each sha256 value, so more than one
`blob_file` entry may point to the same `blob_id` in the `blob_blob` table.
This is how Tsugi implements its single-instance store.

* The `$CFG->dataroot` area on disk. This stores blobs in a two-level folder hierarchy
based on the sha256 of the blob. The top folder is the first two characters of the sha256,
the second folder is the third and fourth characters, and the file name is the entire
sha256. This is also a single-instance store, so more than one `blob_file` entry can
point to one file on disk through the `path` column.
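
The folder layout described above can be sketched as a small path-building function. This is only an illustration under stated assumptions: `blobPathSketch` is a hypothetical name and is not taken from the Tsugi code base, and the validation step is an addition for safety rather than documented behavior.

```php
<?php
// Hypothetical helper (not Tsugi's actual implementation): builds the
// on-disk path for a blob from its sha256, following the layout in the README.
function blobPathSketch(string $dataroot, string $sha256): string
{
    $sha256 = strtolower($sha256);
    // Assumption: reject anything that is not a 64-character hex digest.
    if (!preg_match('/^[0-9a-f]{64}$/', $sha256)) {
        throw new InvalidArgumentException('Expected a 64-character hex SHA-256');
    }
    $top = substr($sha256, 0, 2);     // first two characters
    $second = substr($sha256, 2, 2);  // third and fourth characters
    return rtrim($dataroot, '/') . "/$top/$second/$sha256";
}

$sha = '9dad808bd56037d66276e679f7401c08e8723603627abf8f2f7d63b5ff214fb7';
echo blobPathSketch('/Users/csev/tsugi_blobs', $sha), "\n";
```

Because the path depends only on the content hash, two uploads with identical bytes map to the same file, which is what makes the disk area a single-instance store.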

In versions of Tsugi before 2018-02, blobs were stored in the `content` column
of the `blob_file` table, indexed by `context_id` but not `link_id`. The result was an odd
single-instance store scoped to each course. The approach after 2018-02 is single-instance
across the system, with `blob_file` indexed by both `context_id` and `link_id`.
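
To make the system-wide single-instance idea concrete, here is a minimal in-memory sketch. The class name `SingleInstanceStoreSketch` and its arrays are hypothetical stand-ins for the `blob_blob` and `blob_file` tables; the real code works against the database and may differ in detail.

```php
<?php
// Illustrative in-memory model only; arrays stand in for database tables.
final class SingleInstanceStoreSketch
{
    /** @var array<string,int> sha256 => blob_id (stand-in for blob_blob index) */
    private array $blobIdBySha = [];
    /** @var array<int,string> blob_id => content (stand-in for blob_blob) */
    private array $blobs = [];
    /** @var array<int,array> metadata rows (stand-in for blob_file) */
    private array $files = [];

    public function upload(int $contextId, int $linkId, string $fileName, string $content): int
    {
        $sha = hash('sha256', $content);
        // Store the bytes only the first time this sha256 is seen.
        if (!isset($this->blobIdBySha[$sha])) {
            $blobId = count($this->blobs) + 1;
            $this->blobs[$blobId] = $content;
            $this->blobIdBySha[$sha] = $blobId;
        }
        // Every upload still gets its own metadata row.
        $this->files[] = [
            'context_id' => $contextId,
            'link_id' => $linkId,
            'file_name' => $fileName,
            'file_sha256' => $sha,
            'blob_id' => $this->blobIdBySha[$sha],
        ];
        return $this->blobIdBySha[$sha];
    }

    public function blobCount(): int { return count($this->blobs); }
    public function fileCount(): int { return count($this->files); }
}

$store = new SingleInstanceStoreSketch();
$store->upload(1, 10, 'notes.txt', 'same bytes');
$store->upload(2, 20, 'copy-of-notes.txt', 'same bytes');
echo $store->fileCount(), " file rows, ", $store->blobCount(), " blob\n";
```

Two uploads from different contexts produce two metadata rows but a single stored blob, which mirrors the relationship between `blob_file` and `blob_blob`.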

In post-2018-02 Tsugi, blobs are only stored in `blob_blob` or `$CFG->dataroot`, but
the access code serves blobs from any of the three locations. Eventually, we will
migrate all blobs stored in `blob_file` into `blob_blob` and make the `blob_file.content`
column obsolete.

Test Harness
------------

The easiest way to test the blob store is to use the blob sample code from:

https://github.com/tsugiproject/tsugi-php-samples

Sweet test script to fake legacy blobs
--------------------------------------

Upload some files into `blob_blob` (i.e. `$CFG->dataroot` is not set) and then
run this to get some "legacy" files with content in `blob_file`.

Don't run this on a production database:

UPDATE blob_file, blob_blob SET blob_file.content=blob_blob.content
WHERE blob_file.blob_id = blob_blob.blob_id ;
UPDATE blob_file SET blob_id=null;
DELETE from blob_blob;

Then you can test migration from legacy `blob_file` to `blob_blob`.

Sample Executions of Admin Scripts in admin/blob
------------------------------------------------

$ php blobcheck.php
This is a dry run, use 'php blobcheck.php remove' to actually remove the blobs.
DELETE 4 69893b55bd8c9c5c53df72e3ea7e325cd2df8729d87b4dedce43630d668e6e1b
# unreferenced blobs found=1 delete=1

$ php blobcheck.php remove
This IS NOT A DRILL!
...
DELETE 4 69893b55bd8c9c5c53df72e3ea7e325cd2df8729d87b4dedce43630d668e6e1b
# unreferenced blobs found=1 delete=1

$ php blobcheck.php
This is a dry run, use 'php blobcheck.php remove' to actually remove the blobs.
# unreferenced blobs found=0 delete=0

$ php filecheck.php
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
rm /Users/csev/tsugi_blobs/7c/95/7c954...98548f526218ef633152934334967
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
# folders scan=4 skip=0 rm=0
# files scan=2 skip=0 good=1 rm=1

$ php filecheck.php remove
This IS NOT A DRILL!
...
rm /Users/csev/tsugi_blobs/7c/95/7c954...98548f526218ef633152934334967
rmdir /Users/csev/tsugi_blobs/7c/95
rmdir /Users/csev/tsugi_blobs/7c
# folders scan=4 skip=0 rm=2
# files scan=2 skip=0 good=1 rm=1

$ php filecheck.php
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
# folders scan=2 skip=0 rm=0
# files scan=1 skip=0 good=1 rm=0
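
The dry run of `blobcheck.php` reports `blob_blob` rows that no `blob_file` row points to. The following is a rough in-memory sketch of that matching logic, assuming rows keyed by the `blob_id`, `blob_sha256`, and `file_sha256` column names used in this commit. The function name `findUnreferencedBlobsSketch` is hypothetical; the real script does this with a SQL `LEFT JOIN`.

```php
<?php
// Hypothetical sketch: returns the blob rows that have no matching file row.
// A file row references a blob when both blob_id and the sha256 agree.
function findUnreferencedBlobsSketch(array $blobRows, array $fileRows): array
{
    $referenced = [];
    foreach ($fileRows as $file) {
        if ($file['blob_id'] === null) {
            continue; // row points at disk or legacy content, not at blob_blob
        }
        $referenced[$file['blob_id'] . ':' . $file['file_sha256']] = true;
    }

    $orphans = [];
    foreach ($blobRows as $blob) {
        $key = $blob['blob_id'] . ':' . $blob['blob_sha256'];
        if (!isset($referenced[$key])) {
            $orphans[] = $blob;
        }
    }
    return $orphans;
}

// Made-up sample rows for illustration only.
$blobs = [
    ['blob_id' => 4, 'blob_sha256' => 'aaa'],
    ['blob_id' => 5, 'blob_sha256' => 'bbb'],
];
$files = [
    ['blob_id' => 5, 'file_sha256' => 'bbb'],
    ['blob_id' => null, 'file_sha256' => 'ccc'],
];
foreach (findUnreferencedBlobsSketch($blobs, $files) as $row) {
    echo "DELETE {$row['blob_id']} {$row['blob_sha256']}\n";
}
```

Only the blob with `blob_id` 4 is reported here, since blob 5 is still referenced by a file row.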

138 changes: 138 additions & 0 deletions admin/blob/TESTING.md
@@ -0,0 +1,138 @@

How to Test The Blob Storage
----------------------------

Blob Storage is pretty complex since Tsugi supports three ways to store blobs.

* Multi instance store in the database (blob_file)
* Single instance store in the database (blob_blob)
* Single instance store in the file system


Every uploaded file has information like MIME type, file name, etc. in a row
in the blob_file table.

Since we regularly and automatically clean out the 12345 key, as a very
special case we store the blobs for 12345 in the `content` column of `blob_file`.
There is no attempt to do single-instance storage in this case. To override
this behavior, set `$CFG->testblobs`:

$CFG->testblobs = false; // Store all blobs in the single-instance store
$CFG->testblobs = array('12345', '678910'); // Keep blobs for more than one key in blob_file
$CFG->testblobs = '12345'; // Default
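
As a rough illustration of how the three forms of `$CFG->testblobs` could be interpreted, consider the sketch below. The helper name `usesTestBlobStorageSketch` is hypothetical and the exact matching rules in Tsugi are an assumption; only the three configuration shapes come from the text above.

```php
<?php
// Hypothetical helper: decides whether uploads under $key keep their blob
// in blob_file.content instead of the single-instance store.
function usesTestBlobStorageSketch($testblobs, string $key): bool
{
    if ($testblobs === false) {
        return false; // all blobs go to the single-instance store
    }
    if (is_array($testblobs)) {
        // Assumption: keys are compared as strings.
        return in_array($key, array_map('strval', $testblobs), true);
    }
    return (string) $testblobs === $key;
}

var_dump(usesTestBlobStorageSketch('12345', '12345'));
var_dump(usesTestBlobStorageSketch(array('12345', '678910'), '678910'));
var_dump(usesTestBlobStorageSketch(false, '12345'));
```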


Whether blobs are in the database or on disk is controlled by:

$CFG->dataroot = '/Users/csev/tsugi_blobs';

The default for this is `false` so blobs are stored in the database.


Test Harness
------------

The easiest way to test the blob store is to use the blob sample code from:

https://github.com/tsugiproject/tsugi-php-samples

Check it out and install it per its instructions.

Test Scenario
-------------

You will need to look at the database to verify your tests. Having a relatively empty
database makes it easier.

Start with the defaults; don't set either `testblobs` or `dataroot`.

Make two copies of a file with different names.

* Run the test harness and launch the Blob tool - you are in the 12345 key.
* Upload both files and verify that there are two rows in `blob_file` with the blob
in the `content` column of both rows
* Look at each of the files in the browser and verify they both work
* Delete both files and verify that the `blob_file` rows are gone

Then set:

$CFG->testblobs = false; // To force 12345 to use the single instance store

Do not set `$CFG->dataroot`.

* Upload both files again. There should be two rows in `blob_file` and neither
should have a blob in the `content` column. They should have the same value in the
`blob_id` column, and the `blob_blob` table should have one row for that
`blob_id`
* Look at each of the files in the browser and verify they both work
* Navigate to `admin/blob` and run (you should not find any unreferenced blobs)

php blobcheck.php

This is a dry run, use 'php blobcheck.php remove' to actually remove the blobs.
# unreferenced blobs found=0 delete=0

* Delete both files and verify that the `blob_file` rows are gone - but the `blob_blob`
row is still there!
* Navigate to `admin/blob` and run (you should see the unreferenced blob)

php blobcheck.php
This is a dry run, use 'php blobcheck.php remove' to actually remove the blobs.
DELETE 5 9dad808bd56037d66276e679f7401c08e8723603627abf8f2f7d63b5ff214fb7
# unreferenced blobs found=1 delete=1

* The dry run does *not* delete the blob. To delete it, run:

php blobcheck.php remove
This IS NOT A DRILL!
...
DELETE 5 9dad808bd56037d66276e679f7401c08e8723603627abf8f2f7d63b5ff214fb7
# unreferenced blobs found=1 delete=1

* Check in `blob_blob` and make sure the row is not there.

Now set `$CFG->dataroot` and leave `testblobs` false.

* Upload both files again. There should be two rows in `blob_file` and neither
should have a blob in the `content` column. They should have the same value in the
`path` column. The `blob_id` should be null and there should be no row added to the
`blob_blob` table. The file will have a path based on the sha-256 of the file, like:

/Users/csev/tsugi_blobs/9d/ad/9dad808b...

* Check if the file exists and compare it to the original uploaded file.

* Look at each of the files in the Tsugi Blob Tool UI and verify they both work

* Navigate to `admin/blob` and run (you should not find any unreferenced files)

php filecheck.php
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
# folders scan=2 skip=0 rm=0
# files scan=1 skip=0 good=1 rm=0

* Delete both files in the UI and verify that the `blob_file` rows are gone - but the file
on disk is still there!
* Navigate to `admin/blob` and run (you should see the unreferenced file)

php filecheck.php
This is a dry run, use 'php filecheck.php remove' to actually remove the files.
rm /Users/csev/tsugi_blobs/9d/ad/9dad808bd56037d66276e679f7401c0....
# folders scan=2 skip=0 rm=0
# files scan=1 skip=0 good=0 rm=1

* The dry run does *not* delete the file. To delete it, run:

php filecheck.php remove
This IS NOT A DRILL!
...
rm /Users/csev/tsugi_blobs/9d/ad/9dad808bd56037d66276e679f7401c08e8723603627abf8f2f7d63b5ff214fb7
rmdir /Users/csev/tsugi_blobs/9d/ad
rmdir /Users/csev/tsugi_blobs/9d
# folders scan=2 skip=0 rm=2
# files scan=1 skip=0 good=0 rm=1

* Check to make sure the file (and folders) are not there.
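
For the step in the scenario above that compares the file on disk to the original upload, a small script can do the comparison by hash. This is a sketch under assumptions: `storedBlobMatchesSketch` is a hypothetical helper, and the check that the file name equals the hash relies on the naming convention described in the README.

```php
<?php
// Hypothetical helper: true when the stored blob has the same sha256 as the
// original upload and is named after that sha256.
function storedBlobMatchesSketch(string $storedPath, string $originalPath): bool
{
    if (!is_file($storedPath) || !is_file($originalPath)) {
        return false;
    }
    $storedHash = hash_file('sha256', $storedPath);
    $originalHash = hash_file('sha256', $originalPath);
    return $storedHash === $originalHash
        && basename($storedPath) === $storedHash;
}

// Usage from the command line (both paths are supplied by the tester):
//   php compare.php /path/under/dataroot/<sha256> /path/to/original-file
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    echo storedBlobMatchesSketch($argv[1], $argv[2]) ? "match\n" : "MISMATCH\n";
}
```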



48 changes: 48 additions & 0 deletions admin/blob/blobcheck.php
@@ -0,0 +1,48 @@
<?php

use \Tsugi\Util\U;
use \Tsugi\Core\LTIX;

require_once("../../config.php");

if ( ! U::isCli() ) die('Must be command line');

LTIX::getConnection();

$dryrun = ! ( isset($argv[1]) && $argv[1] == 'remove');

if ( $dryrun ) {
echo("This is a dry run, use 'php blobcheck.php remove' to actually remove the blobs.\n");
} else {
echo("This IS NOT A DRILL!\n");
sleep(5);
echo("...\n");
sleep(5);
}

$stmt = $PDOX->query("SELECT BB.blob_id, BB.blob_sha256, BB.created_at, BB.accessed_at
FROM {$CFG->dbprefix}blob_blob AS BB
LEFT JOIN {$CFG->dbprefix}blob_file AS BF
ON BB.blob_sha256 = BF.file_sha256 AND BB.blob_id = BF.blob_id
WHERE BF.blob_id IS NULL");
$stmt->execute();

$checked = 0;
$deleted = 0;

while ( $row = $stmt->fetch(PDO::FETCH_ASSOC) ) {
$checked++;
$blob_sha256 = $row['blob_sha256'];
$blob_id = $row['blob_id'];
if ( ! $blob_id ) continue;

echo("DELETE $blob_id $blob_sha256\n");
$deleted++;
if ( ! $dryrun ) {
$s2 = $PDOX->prepare("DELETE FROM {$CFG->dbprefix}blob_blob
WHERE blob_id = :ID");
$s2->execute(array(':ID' => $blob_id));
}
}

echo("# unreferenced blobs found=$checked delete=$deleted\n");