opencat/core

Shared data models, contracts, and enums for the OpenCAT Framework.

Every other OpenCAT package depends on this one. It contains no business logic — only the shapes that the rest of the framework passes around.

Installation

composer require opencat/core

Requires ext-intl and ext-mbstring.

What's inside

Models

| Class | Purpose |
| --- | --- |
| Segment | Ordered sequence of string and InlineCode elements; one translatable unit |
| SegmentPair | Source Segment + target Segment (null when untranslated), status, and lock flag |
| BilingualDocument | Ordered collection of SegmentPair objects plus filter skeleton data |
| InlineCode | A non-translatable formatting marker inside a Segment (bold tag, link, line break, etc.) |
| TranslationUnit | A stored source/target pair in a translation memory, with metadata |
| MatchResult | A TM lookup result: TranslationUnit plus similarity score and match type |
| QualityIssue | One issue raised by a QA check: check ID, severity, message, and character offset |
| TermEntry | A bilingual term pair from a glossary: source/target text, domain, and forbidden flag |
| TermMatch | A term found in running text: TermEntry plus the matched span |

Contracts (interfaces)

| Interface | Implemented by |
| --- | --- |
| FileFilterInterface | filter-plaintext, filter-html, filter-docx, others |
| SegmentationEngineInterface | segmentation |
| TranslationMemoryInterface | translation-memory |
| MachineTranslationInterface | mt |
| TerminologyProviderInterface | terminology |
| QualityCheckInterface | qa |
| DocumentQualityCheckInterface | qa |

Enums

| Enum | Values |
| --- | --- |
| SegmentStatus | Untranslated, Draft, Translated, Reviewed, Approved, Rejected |
| SegmentState | States for external interchange formats |
| InlineCodeType | OPENING, CLOSING, STANDALONE |
| MatchType | EXACT, EXACT_TEXT, FUZZY |
| QualitySeverity | INFO, WARNING, ERROR |
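
As a rough illustration of how calling code might branch on one of these enums, the sketch below declares a local stand-in for QualitySeverity so it runs without the package installed. The stand-in is assumed to be a pure (non-backed) enum, which this README does not confirm, and isBlocking() is a hypothetical helper, not part of opencat/core.

```php
<?php
declare(strict_types=1);

// Local stand-in for the package's QualitySeverity enum.
// Assumption: a pure enum with exactly the three cases listed above.
enum QualitySeverity
{
    case INFO;
    case WARNING;
    case ERROR;
}

// Hypothetical helper: decide whether an issue should block delivery.
function isBlocking(QualitySeverity $severity): bool
{
    // match is exhaustive here, so adding a new case without handling it
    // raises an UnhandledMatchError instead of silently passing.
    return match ($severity) {
        QualitySeverity::ERROR => true,
        QualitySeverity::WARNING, QualitySeverity::INFO => false,
    };
}

$blocking = array_filter(QualitySeverity::cases(), isBlocking(...));
$names = array_map(
    static fn (QualitySeverity $s): string => $s->name,
    array_values($blocking),
);

echo implode(', ', $names), PHP_EOL; // ERROR
```

Using match rather than if/else keeps the mapping in one place and fails loudly when a case is left unhandled.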

Exceptions

Each domain has its own exception class extending \RuntimeException:

FilterException · MtException · SegmentationException · TerminologyException · TmException
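
Because every domain exception extends \RuntimeException, a caller can handle one domain specifically and fall back to a generic handler for the rest. The following is a minimal sketch under that assumption: the two exception classes are redeclared locally as stand-ins so the snippet runs on its own, and describeFailure() is a hypothetical helper.

```php
<?php
declare(strict_types=1);

// Local stand-ins for two of the package's exception classes.
class FilterException extends \RuntimeException {}
class TmException extends \RuntimeException {}

// Hypothetical helper: run an operation and map any failure to a short label.
function describeFailure(callable $operation): string
{
    try {
        $operation();
        return 'ok';
    } catch (FilterException $e) {
        // Most specific handler first: file import/export problems.
        return 'filter: ' . $e->getMessage();
    } catch (\RuntimeException $e) {
        // Any other domain exception, or any other RuntimeException.
        return 'other: ' . $e->getMessage();
    }
}

echo describeFailure(fn () => throw new FilterException('unreadable file')), PHP_EOL;
// filter: unreadable file
echo describeFailure(fn () => throw new TmException('lookup failed')), PHP_EOL;
// other: lookup failed
```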

Working with Segment

A Segment holds an ordered mix of plain strings and InlineCode objects:

use CatFramework\Core\Model\Segment;
use CatFramework\Core\Model\InlineCode;
use CatFramework\Core\Enum\InlineCodeType;

$bold = new InlineCode('b1', InlineCodeType::OPENING, '<strong>');
$boldClose = new InlineCode('b1', InlineCodeType::CLOSING, '</strong>');

$segment = new Segment('seg-1', [
    'Hello ',
    $bold,
    'world',
    $boldClose,
    '!',
]);

$segment->getPlainText();    // "Hello world!"
$segment->isEmpty();         // false
$segment->getInlineCodes();  // [$bold, $boldClose]
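
To make the relationship between the element list and the two accessors concrete, here is a self-contained sketch of how plain text and inline codes could be derived from such a mixed list. It is not the package's implementation: InlineCodeStub, its property names, and the two functions are stand-ins invented for illustration.

```php
<?php
declare(strict_types=1);

// Minimal stand-in for InlineCode; the property names are assumptions.
final class InlineCodeStub
{
    public function __construct(
        public readonly string $id,
        public readonly string $data,
    ) {}
}

/** @param list<string|InlineCodeStub> $elements */
function plainText(array $elements): string
{
    // Keep only the string elements, in order, and join them.
    return implode('', array_filter($elements, 'is_string'));
}

/**
 * @param list<string|InlineCodeStub> $elements
 * @return list<InlineCodeStub>
 */
function inlineCodes(array $elements): array
{
    // Keep only the code objects and reindex from zero.
    return array_values(array_filter(
        $elements,
        static fn ($e): bool => $e instanceof InlineCodeStub,
    ));
}

$elements = [
    'Hello ',
    new InlineCodeStub('b1', '<strong>'),
    'world',
    new InlineCodeStub('b1', '</strong>'),
    '!',
];

echo plainText($elements), PHP_EOL;          // Hello world!
echo count(inlineCodes($elements)), PHP_EOL; // 2
```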

Working with BilingualDocument

use CatFramework\Core\Model\BilingualDocument;
use CatFramework\Core\Model\SegmentPair;
use CatFramework\Core\Enum\SegmentStatus;

$doc = new BilingualDocument(
    sourceLanguage: 'en-US',
    targetLanguage: 'fr-FR',
    originalFile: 'report.docx',
    mimeType: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
);

foreach ($doc->getSegmentPairs() as $pair) {
    echo $pair->source->getPlainText(), PHP_EOL;  // source text
    echo $pair->status->name, PHP_EOL;            // SegmentStatus enum case name
    echo $pair->isLocked ? 'locked' : 'editable', PHP_EOL;
}

Implementing a custom file filter

use CatFramework\Core\Contract\FileFilterInterface;
use CatFramework\Core\Model\BilingualDocument;
use CatFramework\Core\Exception\FilterException;

class MyFilter implements FileFilterInterface
{
    public function supports(string $filePath, ?string $mimeType = null): bool
    {
        return str_ends_with(strtolower($filePath), '.myext');
    }

    public function extract(string $filePath, string $sourceLanguage, string $targetLanguage): BilingualDocument
    {
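        // Parse $filePath, then create and return a BilingualDocument.
        // Until that is implemented, fail loudly rather than fall through
        // without returning the declared type.
        throw new FilterException("Extraction not implemented for {$filePath}");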
    }

    public function rebuild(BilingualDocument $document, string $outputPath): void
    {
        // use $document->skeleton to reconstruct the file
    }

    public function getSupportedExtensions(): array
    {
        return ['.myext'];
    }
}
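
One way an application might choose among several registered filters is to ask each one whether it supports the file and take the first match. The sketch below assumes that approach. SupportsFile is a cut-down stand-in for FileFilterInterface containing only supports(), and ExtensionFilter and pickFilter() are hypothetical names, not part of the package.

```php
<?php
declare(strict_types=1);

// Cut-down stand-in for FileFilterInterface: only the method this sketch needs.
interface SupportsFile
{
    public function supports(string $filePath, ?string $mimeType = null): bool;
}

// Hypothetical filter that matches on a file extension, case-insensitively.
final class ExtensionFilter implements SupportsFile
{
    public function __construct(private string $extension) {}

    public function supports(string $filePath, ?string $mimeType = null): bool
    {
        return str_ends_with(strtolower($filePath), $this->extension);
    }
}

/**
 * Hypothetical helper: return the first filter that accepts the file, or null.
 *
 * @param list<SupportsFile> $filters
 */
function pickFilter(array $filters, string $filePath): ?SupportsFile
{
    foreach ($filters as $filter) {
        if ($filter->supports($filePath)) {
            return $filter;
        }
    }

    return null;
}

$filters = [new ExtensionFilter('.txt'), new ExtensionFilter('.myext')];

var_dump(pickFilter($filters, 'Report.MYEXT') === $filters[1]); // bool(true)
var_dump(pickFilter($filters, 'image.png'));                     // NULL
```

Returning null for an unsupported file leaves the caller free to decide whether that is an error or a file to skip.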

Related packages

About

Read-only mirror of opencat/core — split from shaikhammar/opencat-framework
