a large commit
pjc09h committed Jun 6, 2024
1 parent df072bb commit 9a7201d
Showing 70 changed files with 2,870 additions and 3,293 deletions.
27 changes: 13 additions & 14 deletions README.md
@@ -18,20 +18,19 @@ If you want to scale horizontally, the software supports both [Redis clusters](a
Please note that Redis clusters expect at least three nodes.
This lower limit is inherent to Redis' [cluster implementation](https://redis.io/docs/management/scaling/).

### Universal database id's

BioGazelle is in the process of migrating to [UUID v7 primary keys](https://uuid.ramsey.dev/en/stable/rfc4122/version7.html) to enable useful content-agnostic operations such as tagging and AI integration.
This will consolidate the database and allow for powerful cross-object association.
The UUIDs are stored as binary strings for index speed and to minimize disk usage.
By the way, *all* binary data is transparently converted by the [database wrapper](app/Database.php).

## Full stack search engine rewrite
## Deeply indexed and programmatically enhanced

Data indexing is important, so BioGazelle has upgraded to [Manticore Search](https://manticoresearch.com), the successor to Sphinx.
This upgrade also involved a [rewrite of the search configuration](utilities/config/manticore.conf) from scratch, based on AnimeBytes' example.
The Gazelle frontend itself uses a [rewritten browse.php controller](sections/torrents/browse.php) and a [brand new Twig template](templates/torrents/search.twig).
Oh yeah, the [PHP backend class](app/Manticore.php) is also completely rewritten, replacing at least four legacy classes.

### Universal database id's

BioGazelle is in the process of migrating to [short UUID primary keys](https://mariadb.com/kb/en/uuid_short/) to enable useful content-agnostic operations such as tagging and AI integration.
This will consolidate the database and allow for powerful cross-object association.


## Secure authentication system

The user handling, including registration, logins, etc., has been rewritten into a unified system in the [Auth class](app/Auth.php).
@@ -65,12 +64,6 @@ But we took it up a notch by upgrading this system to use the [modern WebAuthn s
use a hardware key, a smartphone fingerprint or QR code reader, or just generate a key in the browser.
The underlying library is the canonical [web-auth/webauthn-lib](https://github.com/web-auth/webauthn-lib).

## OpenAI integration

One of BioGazelle's goals is to place data in context using [OpenAI's completions API](app/OpenAI.php) to generate tl;dr summaries and tags from content descriptions.
Just paste your abstract into the torrent group description and get a succinct natural language summary with tags.
It's possible to disable AI content display in the user settings.

## Twig template system

[BioGazelle's Twig interface](app/Twig.php) takes cues from OPS's extended filters and functions.
@@ -94,6 +87,12 @@ BioGazelle uses [Starboard Notebook](https://starboard.gg) to support [Jupyter N
This lets users document technical topics such as data processing workflows complete with executable code examples and LaTeX expressions.
Our secure implementation leverages sandboxed iframes on a dedicated subdomain to ensure no cookie or local storage leaks.

### OpenAI integration

One of BioGazelle's goals is to place data in context using [OpenAI's completions API](app/OpenAI.php) to generate tl;dr summaries and tags from content descriptions.
Just paste your abstract into the torrent group description and get a succinct natural language summary with tags.
It's possible to disable AI content display in the user settings.

### Good typography

BioGazelle supports an array of [unobtrusive fonts](resources/scss/assets/fonts.scss) with the appropriate glyphs for bold, italic, and monospace.
2 changes: 1 addition & 1 deletion app/Auth.php
@@ -782,7 +782,7 @@ public function resendConfirmation(int|string $identifier)

# try to resolve the email address
$identifier = \Gazelle\Escape::string($identifier);
$column = $app->dbNew->determineIdentifier($identifier);
$column = $app->dbNew->determineId($identifier);

# todo: maybe change unresolved id or uuid to null
# and let the backend decide which column to use?
3 changes: 2 additions & 1 deletion app/Better.php
@@ -136,7 +136,8 @@ public static function missingCitations(bool $snatchedOnly = false): array
from torrents_group
{$subQuery}
where torrents_group.id not in
(select distinct groupId from literature_groups)
(select distinct contentId from literature_links)
and literature_links.contentType = 'torrentGroups'
order by rand() limit {$resultCount}
";
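
In the hunk above, the `contentType` condition sits outside the subquery, where `literature_links` is only in scope if `{$subQuery}` joins it (that part is not visible here). A minimal sketch of one alternative, assuming the intent is to skip only groups that are linked with `contentType = 'torrentGroups'`, moves the filter inside the subquery. The function name `missingCitationsQuery` is hypothetical, and `torrents_group.id` stands in for the select list, which the hunk does not show.

```php
<?php

/**
 * Hypothetical query builder, not the commit's code.
 * Assumes $subQuery holds only join clauses, as its position
 * before the "where" in the original query suggests.
 */
function missingCitationsQuery(string $subQuery, int $resultCount): string
{
    return "
        select torrents_group.id
        from torrents_group
        {$subQuery}
        where torrents_group.id not in (
            select distinct contentId
            from literature_links
            where contentType = 'torrentGroups'
        )
        order by rand() limit {$resultCount}
    ";
}

echo missingCitationsQuery("", 10);
```

If `contentId` can be null, `not in` returns no rows at all, so a `not exists` form may be worth considering.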

37 changes: 37 additions & 0 deletions app/Crossref.php
@@ -0,0 +1,37 @@
<?php

declare(strict_types=1);


/**
* Gazelle\Crossref
*
* @see https://api.crossref.org/swagger-ui/index.html
*/

namespace Gazelle;

class Crossref
{
# guzzle client
private \GuzzleHttp\Client $client;
private string $baseUri = "https://api.crossref.org";

# cache settings
private string $cachePrefix = "crossref:";
private string $cacheDuration = "1 hour";
private string $cacheAlgorithm = "sha3-512";


/**
* __construct
*/
public function __construct()
{
# https://docs.guzzlephp.org/en/stable/quickstart.html
$this->client = new \GuzzleHttp\Client([
"base_uri" => $this->baseUri,
"timeout" => 2.0,
]);
}
} # class
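
The class as committed stores cache settings but has no method that uses them yet. A minimal sketch of how a cache key might be derived from them, assuming the key is the prefix followed by a hash of the request path. The helper `crossrefCacheKey` and the example path are hypothetical; only the prefix and algorithm values come from the class above.

```php
<?php

/**
 * Hypothetical helper, not part of the commit: derive a cache key
 * from a request path using the prefix and algorithm the class stores.
 */
function crossrefCacheKey(
    string $path,
    string $prefix = "crossref:",
    string $algorithm = "sha3-512"
): string {
    # hash() returns lowercase hex; sha3-512 yields 128 hex characters
    return $prefix . hash($algorithm, $path);
}

$key = crossrefCacheKey("/works/10.1000/182");
echo strlen($key), PHP_EOL; # prints 137 (9-character prefix plus 128 hex characters)
```

Hashing the path keeps keys at a fixed length no matter how long the query string gets.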
50 changes: 42 additions & 8 deletions app/Database.php
@@ -278,34 +278,71 @@ public function shortUuid(): string
/**
* slug
*
* @param string $string
* @param ?string $string
* @return string
*
* @see https://laravel.com/api/master/Illuminate/Support/Str.html#method_slug
*/
public function slug(string $string): string
public function slug(?string $string): string
{
return \Illuminate\Support\Str::slug($string);
}


/**
* determineIdentifier
* determineId
*
* Determine the identifier to use for a query.
* Used for finding stuff by id, uuid, or slug.
*
* @param int|string $id
* @return string
*/
public function determineIdentifier(int|string $id): string
public function determineId(int|string $id): string
{
$app = App::go();

# cast to string
$id = strval($id);

# openAlex
$good = preg_match("/{$app->env->regexOpenAlex}/", $id);
if ($good) {
return "openAlexId";
}

# doi
$good = preg_match("/{$app->env->regexDoi}/i", $id);
if ($good) {
return "doi";
}

# orcid
$good = preg_match("/{$app->env->regexOrcid}/i", $id);
if ($good) {
return "orcid";
}

# issn
$good = preg_match("/{$app->env->regexIssn}/", $id);
if ($good) {
return "issn";
}

# rorId
$good = preg_match("/{$app->env->regexRor}/", $id);
if ($good) {
return "rorId";
}

# normal numeric id
if (is_int($id) || is_numeric($id)) {
return "id";
}

# default slug
return "slug";

/*
# https://ihateregex.io/expr/uuid/
if (is_string($id) && strlen($id) === 36 && preg_match("/{$app->env->regexUuid}/iD", $id)) {
@@ -317,9 +354,6 @@ public function determineIdentifier(int|string $id): string
return "uuid";
}
*/

# default slug
return "slug";
}
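
The ordered checks in `determineId` can be sketched as a standalone function. The commit reads its regexes from `$app->env`, which is not shown here, so every pattern below is an illustrative assumption, as is the name `determineIdSketch`. The sketch keeps the same order as the diff: the first match wins, then the numeric check, then the slug fallback.

```php
<?php

/**
 * Sketch of the ordered matching in determineId.
 * All patterns are assumptions, not the project's actual regexes.
 */
function determineIdSketch(int|string $id): string
{
    # cast once, so every check below sees a string
    $id = strval($id);

    # first match wins, so order matters
    $patterns = [
        "openAlexId" => '/^[A-Z]\d{4,}$/',
        "doi" => '/^10\.\d{4,9}\/\S+$/i',
        "orcid" => '/^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$/i',
        "issn" => '/^\d{4}-\d{3}[\dX]$/',
        "rorId" => '/^0[a-hj-km-np-tv-z0-9]{6}\d{2}$/',
    ];

    foreach ($patterns as $column => $pattern) {
        if (preg_match($pattern, $id) === 1) {
            return $column;
        }
    }

    # after strval() the value is always a string,
    # so is_numeric() alone covers the numeric case
    if (is_numeric($id)) {
        return "id";
    }

    # default slug
    return "slug";
}

echo determineIdSketch("10.1000/182"), PHP_EOL; # prints doi
echo determineIdSketch(42), PHP_EOL; # prints id
```

With these assumed patterns, a nine-digit string starting with `0` would resolve to `rorId` before the numeric check runs, so the ordering is a design decision rather than a detail.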


@@ -672,7 +706,7 @@ public function upsert(string $table, array $data = []): array
# it was updated, resolve a key from the data
foreach ($data as $key => $value) {
if (in_array(strtolower(strval($key)), ["id", "uuid", "slug"])) {
$column = $this->determineIdentifier($value);
$column = $this->determineId($value);
$query = "select * from {$table} where {$column} = ?";
return $this->row($query, [$value], ["hostname" => "source"]);
}
14 changes: 14 additions & 0 deletions app/Format.php
@@ -575,4 +575,18 @@ public static function breadcrumbs()

return $crumbs;
}


/**
* tag
*
* Formats a tag for storage and display.
*
* @param ?string $string
* @return string
*/
public static function tag(?string $string): string
{
return \Illuminate\Support\Str::slug($string);
}
} # class
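
`Format::tag` delegates to Laravel's `Str::slug`, and the nullable parameter lets a missing value through. A rough dependency-free approximation for ASCII input, assuming null is treated as an empty string. The name `tagSketch` is hypothetical, and the sketch omits the transliteration and word substitutions that `Str::slug` performs, so results can differ for non-ASCII input.

```php
<?php

/**
 * Hypothetical stand-in for Str::slug, ASCII input only.
 */
function tagSketch(?string $string): string
{
    # treat null as an empty string
    $string = strtolower($string ?? "");

    # underscores become dashes
    $string = str_replace("_", "-", $string);

    # drop everything except letters, digits, dashes, and whitespace
    $string = preg_replace('/[^a-z0-9\s-]+/', "", $string);

    # collapse runs of dashes and whitespace into one dash
    $string = preg_replace('/[\s-]+/', "-", $string);

    return trim($string, "-");
}

echo tagSketch("  RNA-Seq  Data_Set! "), PHP_EOL; # prints rna-seq-data-set
```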