Dco-ai/php-jina

A PHP Client for Jina

A tool to connect to Jina with PHP. This client will not work without a running Jina installation.

To see how that is set up go here: Jina Installation

Jina Documentation

For more information about Jina go here: Jina

Install with composer command

    composer require dco-ai/php-jina

Install using composer.json

from GitHub directly:

Add this to your composer.json file or create the file and put this in it.

{
  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/Dco-ai/php-jina.git"
    }
  ],
  "require": {
    "dco-ai/php-jina": "dev-main"
  }
}

or from Packagist:

{
  "require": {
    "dco-ai/php-jina": "v1.*"
  }
}

Now run composer update to install the package.

Configuration

This client needs to know a few things about your Jina project to make the connection.

The configuration is an associative array with the following fields:

| Attribute | Type | Description |
| --- | --- | --- |
| url (required) | string | The endpoint of your Jina application. This can be a public URL or a private one if this client is used on the same network. |
| port (required) | string | The port used in your Jina application. |
| endpoints (required) | associative array | Maps each endpoint to the HTTP method the client uses for the curl request. Jina allows custom endpoints, so the client needs to know how to call them. The default is GET: an endpoint that is not listed here is called with GET. |
| dataStore (optional) | associative array | Identifies the Data Store being used. Interaction with Data Stores inside DocArray differs, so the client needs this to handle certain functionality correctly. If no dataStore is set, the default functions are used, which may cause unintended results. |
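The required keys and the GET fallback described above can be pictured with a short sketch. The helpers missingConfigKeys() and resolveMethod() are hypothetical names used only for illustration; they are not part of php-jina, and the client's own lookup may be implemented differently.

```php
<?php
// Hypothetical helpers, not part of php-jina: they only illustrate the
// configuration rules described in the table above.

/** Returns the names of required keys that are missing from the config. */
function missingConfigKeys(array $config): array
{
    $required = ["url", "port", "endpoints"];
    return array_values(array_diff($required, array_keys($config)));
}

/** Looks up the HTTP method for an endpoint, falling back to GET. */
function resolveMethod(array $config, string $endpoint): string
{
    return $config["endpoints"][$endpoint] ?? "GET";
}

$config = [
    "url" => "localhost",
    "port" => "1234",
    "endpoints" => [
        "/index" => "POST",
        "/search" => "POST",
    ],
];

echo resolveMethod($config, "/index"), "\n";  // listed endpoint
echo resolveMethod($config, "/status"), "\n"; // not listed, falls back to GET
```

Listing every endpoint explicitly, including the GET ones, keeps the configuration self-documenting even though the fallback would cover them.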

Usage

A small example is in src/example.php. It shows how to load the class and then create and update Jina's Document and DocumentArray structures.

First include the package in the header:

<?php
use DcoAi\PhpJina\JinaClient;

Then instantiate the JinaClient class with your configuration:

$config = [
    "url" => "localhost", // The URL or endpoint of your Jina installation
    "port" => "1234", // The port used for your Jina Installation
    "endpoints" => [ // These are the active endpoints in your Jina application with the corresponding method
        "/status" => "GET",
        "/post" => "POST",
        "/index" => "POST",
        "/search" => "POST",
        "/delete" => "DELETE",
        "/update" => "PUT",
        "/" => "GET"
    ]
];
$jina = new JinaClient($config);

Now you can use these functions:

// This creates an empty Document whose attributes you can then populate
$d = $jina->document();

// This creates a DocumentArray that Documents can be added to
$da = $jina->documentArray();

// This adds Documents to a DocumentArray
$jina->addDocument($da, $d);

// This sends the DocumentArray to your Jina application and returns the result.
$jina->submit("/index",$da);

Structures

Document

| Attribute | Type | Description |
| --- | --- | --- |
| id | string | A hexdigest that represents a unique document ID. It is recommended to let Jina set this value. |
| blob | bytes | The raw binary content of this document, which often represents the original document |
| tensor | ndarray-like | The ndarray of the image/audio/video document |
| text | string | A text document |
| granularity | int | The depth of the recursive chunk structure |
| adjacency | int | The width of the recursive match structure |
| parent_id | string | The parent ID from the previous granularity |
| weight | float | The weight of this document |
| uri | string | A URI of the document: a local file path, a remote URL starting with http or https, or a data URI |
| modality | string | An identifier of the modality this document belongs to, used in multi/cross-modal search |
| mime_type | string | MIME type of this document. Required for blob content; for other content it can be guessed |
| offset | float | The offset of the document |
| location | float | The position of the document: the start and end index of a string, the x,y (top, left) coordinates of an image crop, or the timestamp of an audio clip |
| chunks | array | Array of the sub-documents of this document (recursive structure) |
| matches | array | Array of matched documents on the same level (recursive structure) |
| embedding | ndarray-like | The embedding of this document |
| tags | \stdClass | A structured data value consisting of fields that map to dynamically typed values |
| scores | \stdClass | Scores computed for the document; each element corresponds to a metric |
| evaluations | \stdClass | Evaluations performed on the document; each element corresponds to a metric |

DocumentArray

| Attribute | Type | Description |
| --- | --- | --- |
| data | array | An array of Documents |
| parameters | \stdClass | A key/value set of custom instructions to be passed along with the request to Jina |
| targetExecutor | string | A string indicating an Executor to target. By default all Executors are targeted |
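To make the two structures concrete, here is a minimal sketch of the JSON shape they describe, built from plain stdClass objects rather than the client's own classes. The attribute names come from the tables above; the limit parameter is a hypothetical custom instruction, and the exact serialization the client produces is an assumption here.

```php
<?php
// Plain stdClass objects standing in for the client's structures. This only
// illustrates the shape; in real code you would call $jina->document() and
// $jina->documentArray() instead.

$doc = new stdClass();
$doc->text = "hello world";
$doc->tags = new stdClass();
$doc->tags->env = "dev";

$da = new stdClass();
$da->data = [$doc];                 // an array of Documents
$da->parameters = new stdClass();
$da->parameters->limit = 5;         // hypothetical custom instruction

$json = json_encode($da, JSON_PRETTY_PRINT);
echo $json, "\n";
```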

Filters

Filters are unique to each Data Store in DocArray. Their structure and how they are passed depend on how your Executors are set up. Every example here assumes that your Executors accept a filter key in the parameters section of the request. If your Executors accept filters in a different way, you will need to modify the request accordingly.

In this client you can build a filter by chaining together filter functions. First you have to create an instance of the Filter class with the useFilterFormatter() function.

use DcoAi\PhpJina\JinaClient;
// set the config and create a new instance of the JinaClient
$config = [
    "url" => "localhost",
    "port" => "1234",
    "endpoints" => [
        "/status" => "GET",
        "/post" => "POST",
        "/index" => "POST",
        "/search" => "POST",
        "/delete" => "DELETE",
        "/update" => "PUT",
        "/" => "GET",
    ]
];
$jina = new JinaClient($config);

// create a new instance of the filter class
$filterBuilder = $jina->useFilterFormatter();

Now that you have the filter this is how you would chain together a basic filter:

$filterBuilder->
    and()->
        equal("env","dev")->
        equal("userId","2")->
    endAnd()->
    or()->
        notEqual("env","2")->
        greaterThan("id","5")->
    endOr()->
    equal("env","dev")->
    notEqual("env","prod");

Some Data Stores have grouping operators, such as and() and or(), that let you group filters together. Where a Data Store has these operators, each opening function has a corresponding closing function, such as endAnd() and endOr().

Once you have your filter built you will need to retrieve it from the Filter class and add it to the request. This is not done automatically.

// Let's make an empty DocumentArray
$da = $jina->documentArray();
// And add the filter to the parameters
$da->parameters->filter = $filterBuilder->createFilter();
// Print the DocumentArray and see what we got
print_r(json_encode($da, JSON_PRETTY_PRINT));

This filter will produce a string like this:

{
    "data": [],
    "parameters": {
        "filter": [
            {
                "$and": [
                    {
                        "env": {
                            "$eq": "dev"
                        }
                    },
                    {
                        "userId": {
                            "$eq": "2"
                        }
                    }
                ]
            },
            {
                "$or": [
                    {
                        "env": {
                            "$ne": "2"
                        }
                    },
                    {
                        "id": {
                            "$gt": 5
                        }
                    }
                ]
            },
            {
                "env": {
                    "$eq": "dev"
                }
            },
            {
                "env": {
                    "$ne": "prod"
                }
            }
        ]
    }
}

This example is a bit complicated and probably not useful, but it shows what can be done.
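If your Executors expect filters in a different place or shape, it can help to see the same structure as plain PHP arrays. The sketch below hand-builds the first two groups of the JSON shown above without the filter builder; it is an illustration of the structure, not a replacement for createFilter().

```php
<?php
// Hand-built equivalent of the $and and $or groups from the JSON above.
// Single quotes matter here: inside double quotes PHP would try to
// interpolate "$and" as a variable.
$filter = [
    ['$and' => [
        ['env' => ['$eq' => 'dev']],
        ['userId' => ['$eq' => '2']],
    ]],
    ['$or' => [
        ['env' => ['$ne' => '2']],
        ['id' => ['$gt' => 5]],
    ]],
];

// Same envelope as an empty DocumentArray with a filter parameter
$request = ['data' => [], 'parameters' => ['filter' => $filter]];
echo json_encode($request, JSON_PRETTY_PRINT), "\n";
```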

Default Filter

DocArray has a default filter structure that can be used by this client without any configuration changes. Documentation can be found here: Documentation

This is a list of the operators supported by the Default filter. $column is the field you are filtering on and $value is the value it is compared against.

| Query Operator | Chainable Function | Description |
| --- | --- | --- |
| $eq | equal($column, $value) | Equal to (number, string) |
| $ne | notEqual($column, $value) | Not equal to (number, string) |
| $gt | greaterThan($column, $value) | Greater than (number) |
| $gte | greaterThanEqual($column, $value) | Greater than or equal to (number) |
| $lt | lessThan($column, $value) | Less than (number) |
| $lte | lessThanEqual($column, $value) | Less than or equal to (number) |
| $in | in($column, $value) | Is in an array |
| $nin | notIn($column, $value) | Not in an array |
| $regex | regex($column, $value) | Match the specified regular expression |
| $size | size($column, $value) | Match array/dict fields that have the specified size. $size does not accept ranges of values. |
| $exists | exists($column, $value) | Matches documents that have the specified field; predefined fields having a default value (for example empty string, or 0) are considered as not existing; if the expression specifies a field x in tags (tags__x), then the operator tests that x is not None. |

The list of combining functions for the Default filter is here:

| Operator | Chainable Function | Closing Function | Description |
| --- | --- | --- | --- |
| $and | and() | endAnd() | Join query clauses with a logical AND by chaining operator functions between these two functions. |
| $or | or() | endOr() | Join query clauses with a logical OR by chaining operator functions between these two functions. |
| $not | not() | endNot() | Inverts the effect of a query expression that is chained between these two functions. |
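As a way to reason about what each query operator matches, the following sketch models most of them as a local PHP function. The name matchesOperator() is hypothetical, the comparison rules are a reading of the descriptions above, and this is not how DocArray evaluates filters internally. $exists is left out because its rules for default values need more context than a one-line comparison.

```php
<?php
// Rough local model of the operator descriptions above (an assumption-based
// illustration, not DocArray's implementation).
function matchesOperator(string $op, $fieldValue, $operand): bool
{
    switch ($op) {
        case '$eq':  return $fieldValue === $operand;
        case '$ne':  return $fieldValue !== $operand;
        case '$gt':  return $fieldValue > $operand;
        case '$gte': return $fieldValue >= $operand;
        case '$lt':  return $fieldValue < $operand;
        case '$lte': return $fieldValue <= $operand;
        case '$in':  return in_array($fieldValue, $operand, true);
        case '$nin': return !in_array($fieldValue, $operand, true);
        case '$regex':
            // Escape the delimiter so patterns containing "/" still compile
            $pattern = '/' . str_replace('/', '\/', $operand) . '/';
            return preg_match($pattern, $fieldValue) === 1;
        case '$size':
            return is_countable($fieldValue) && count($fieldValue) === $operand;
        default:
            throw new InvalidArgumentException("Unsupported operator: $op");
    }
}

var_dump(matchesOperator('$in', 'dev', ['dev', 'prod']));   // value is in the list
var_dump(matchesOperator('$regex', 'staging-01', '^staging')); // prefix match
```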

AnnLite Filter

This filter uses the AnnLite Data Store and is very similar to the Default filter with some minor differences. Documentation can be found here: Documentation

To use this filter you must add the "type" => "annlite" key to the dataStore array in the configuration.

use DcoAi\PhpJina\JinaClient;
// set the config and create a new instance of the JinaClient
$config = [
    "url" => "localhost",
    "port" => "1234",
    "endpoints" => [
        "/status" => "GET",
        "/post" => "POST",
        "/index" => "POST",
        "/search" => "POST",
        "/delete" => "DELETE",
        "/update" => "PUT",
        "/" => "GET",
    ],
    "dataStore" => [
        "type" => "annlite",
    ]
];
$jina = new JinaClient($config);

Like all other filters you can build it using the chaining method. Here are the specific fields using this Data Store:

| Query Operator | Chainable Function | Description |
| --- | --- | --- |
| $eq | equal($column, $value) | Equal to (number, string) |
| $ne | notEqual($column, $value) | Not equal to (number, string) |
| $gt | greaterThan($column, $value) | Greater than (number) |
| $gte | greaterThanEqual($column, $value) | Greater than or equal to (number) |
| $lt | lessThan($column, $value) | Less than (number) |
| $lte | lessThanEqual($column, $value) | Less than or equal to (number) |
| $in | in($column, $value) | Is in an array |
| $nin | notIn($column, $value) | Not in an array |

The list of combining functions for the AnnLite filter is here:

| Operator | Chainable Function | Closing Function | Description |
| --- | --- | --- | --- |
| $and | and() | endAnd() | Join query clauses with a logical AND by chaining operator functions between these two functions. |
| $or | or() | endOr() | Join query clauses with a logical OR by chaining operator functions between these two functions. |

Weaviate Filter

This filter uses the Weaviate Data Store, which uses GraphQL as its query language. Because the query depends on the schema in the database, the client connects to your Weaviate instance and retrieves the schema in order to build the query. This is done automatically, but you need to add the Weaviate url and port parameters to the dataStore array when creating the JinaClient instance. Documentation can be found here: Documentation

To use this filter you must add the "type" => "weaviate" key to the dataStore array in the configuration.

use DcoAi\PhpJina\JinaClient;
// set the config and create a new instance of the JinaClient
$config = [
    "url" => "localhost",
    "port" => "1234",
    "endpoints" => [
        "/status" => "GET",
        "/post" => "POST",
        "/index" => "POST",
        "/search" => "POST",
        "/delete" => "DELETE",
        "/update" => "PUT",
        "/" => "GET",
    ],
    "dataStore" => [
        "type" => "weaviate",
        "url" => "localhost",
        "port" => "8080",
    ]
];
$jina = new JinaClient($config);

Here are the specific fields using this Data Store:

| Query Operator | Chainable Function | Description |
| --- | --- | --- |
| Not | not($column, $value) | Exclude the value from the query |
| Equal | equal($column, $value) | Equal to the value |
| NotEqual | notEqual($column, $value) | Not equal to the value |
| GreaterThan | greaterThan($column, $value) | Greater than the value |
| GreaterThanEqual | greaterThanEqual($column, $value) | Greater than or equal to the value |
| LessThan | lessThan($column, $value) | Less than the value |
| LessThanEqual | lessThanEqual($column, $value) | Less than or equal to the value |
| Like | like($column, $value) | Allows you to do string searches based on partial match |
| WithinGeoRange | withinGeoRange($column, $value) | A special case of the Where filter for geoCoordinates. If you've set the geoCoordinates property type, you can search in an area based on distance. |
| IsNull | isNull($column, $value=true or false) | Allows you to filter for objects where given properties are null or not null. Note that zero-length arrays and empty strings are equivalent to a null value. |

The list of combining functions for the Weaviate filter is here:

| Operator | Chainable Function | Closing Function | Description |
| --- | --- | --- | --- |
| $and | and() | endAnd() | Join query clauses with a logical AND by chaining operator functions between these two functions. |
| $or | or() | endOr() | Join query clauses with a logical OR by chaining operator functions between these two functions. |

All Other Data Store Filters

Currently, these are not supported but are planned for future releases. If you would like to contribute to this project please feel free to submit a pull request and reach out to me for any questions.

Response

To save on data transfer and memory when making calls to your Jina application, this client cleans up the request and response automatically by removing any key whose value is not set. Keep this in mind when working with the response: check that a key exists before reading it.

If you want all the values returned, you can set a flag when calling the submit() function:

// setting the third parameter to false will not remove any empty values from the response
$jina->submit("/index", $da, false);
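Because unset keys are stripped by default, code that reads the response should tolerate missing properties. The sketch below uses a hand-written JSON string as a stand-in for a cleaned response; treating the result as decoded stdClass objects, and the presence of a data array of documents, are assumptions for illustration, so adapt the access style to what submit() returns in your setup.

```php
<?php
// Hypothetical cleaned response: the first document has no "matches" key
// because it was empty and therefore removed.
$response = json_decode(
    '{"data":[{"id":"abc","text":"hello"},{"id":"def","matches":[{"id":"abc"}]}]}'
);

$counts = [];
foreach ($response->data as $doc) {
    // The null coalescing operator avoids an "Undefined property" warning
    // when the key was stripped from the response.
    $matches = $doc->matches ?? [];
    $counts[$doc->id] = count($matches);
}

print_r($counts);
```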