WEBDEV-5328 Integrate search service with beta search backend #16

Merged Sep 22, 2022 (112 commits)
b08c343
Add new backends
latonv Aug 25, 2022
9f461e6
Remove fetchMetadata
latonv Aug 25, 2022
5194465
Add hit types
latonv Aug 26, 2022
7a3be31
Clean up minor doc issues
latonv Aug 26, 2022
e52c071
Reinstate fetchMetadata for sake of testing
latonv Aug 26, 2022
71ff6f1
Add schema for search hits
latonv Aug 26, 2022
0de2b50
Default to new metadata search
latonv Aug 26, 2022
16eb2e6
Adjust response modeling to conform to new backend
latonv Aug 29, 2022
affedd8
Toy with decoupling SearchService from its backends
latonv Aug 30, 2022
ff16e32
Clean up imports
latonv Aug 30, 2022
d76ab8d
Abstract out shared search backend methods/options
latonv Aug 30, 2022
2bfceec
Clean up type imports
latonv Aug 30, 2022
0bf27b0
Improve documentation
latonv Aug 30, 2022
04f6cc7
Add hit factory and merge type for hits
latonv Aug 30, 2022
55cc700
Organize hit properties
latonv Aug 30, 2022
81fdba5
Conform search response details to PPS shape
latonv Aug 31, 2022
242dfda
Fix Hit type alias quirks
latonv Aug 31, 2022
d870ae8
Add mediatype field to text hits
latonv Aug 31, 2022
51f6068
Update mock response factory
latonv Aug 31, 2022
73d3c3e
Add search request model
latonv Aug 31, 2022
80e8079
Document search backends
latonv Aug 31, 2022
e35557b
Improve type safety of search response details
latonv Aug 31, 2022
874f1e4
Excise metadata fetching (now in separate service)
latonv Aug 31, 2022
f8ea8ae
Update search request params
latonv Aug 31, 2022
519d097
Update demo app with search type options
latonv Aug 31, 2022
19056b4
Remove leftover fetchMetadata bits from tests
latonv Aug 31, 2022
42e252b
Collapse responses directory
latonv Sep 1, 2022
4c0a2c7
Fix tests and formatting
latonv Sep 1, 2022
70983cd
Add additional tests for new backends
latonv Sep 1, 2022
cc044e5
Add unit tests for search response details
latonv Sep 1, 2022
5c01c80
Avoid blank titles in demo
latonv Sep 1, 2022
a7d20c1
Better organize backend tests
latonv Sep 1, 2022
d947963
Rename default-search-backend and update package exports
latonv Sep 1, 2022
2f4f107
Add hit-type tests and fix boolean fields
latonv Sep 1, 2022
d5cf24a
Update sort params to PPS expected format
latonv Sep 1, 2022
8711a4d
Toggle unhelpful eslint rules
latonv Sep 1, 2022
e3fce48
Update README and version
latonv Sep 1, 2022
59586be
Further clarify usage in README
latonv Sep 1, 2022
1aa5c8a
Add sort options to demo
latonv Sep 1, 2022
79b07bb
Address eslint complaints
latonv Sep 1, 2022
bb2820d
Remove old advanced_search backend
latonv Sep 2, 2022
867180b
Add 'omit' option to aggregation params
latonv Sep 2, 2022
0349716
Fix demo app sort button
latonv Sep 2, 2022
40eaf2a
Formatting: missed a semicolon
latonv Sep 2, 2022
d0f480a
Memoize search backends to avoid redundancy
latonv Sep 2, 2022
822c303
Include page_type and page_target params
latonv Sep 2, 2022
6d255bb
Better document search params
latonv Sep 2, 2022
dd733b3
Make new params optional, fix formatting
latonv Sep 2, 2022
2dc06c8
Update URL param keys to PPS expectations
latonv Sep 2, 2022
a0efe06
Add tests for page_type and page_target params
latonv Sep 2, 2022
34e8460
Add unit test for omitting aggregations
latonv Sep 2, 2022
c9f24c3
Add additional unit tests for hit types
latonv Sep 2, 2022
c918cdd
Add aggregations to demo app
latonv Sep 6, 2022
4dc0172
Formatting
latonv Sep 6, 2022
e7ccca7
Update obsolete documentation to refer to PPS
latonv Sep 8, 2022
aea0e18
Add FTS snippets to demo app
latonv Sep 8, 2022
1d42ac6
Fix bug disallowing falsey search params
latonv Sep 8, 2022
467561b
Update obsolete documentation to refer to PPS
latonv Sep 8, 2022
799790e
Formatting
latonv Sep 8, 2022
72b3dba
Clean up snippet return types in demo app
latonv Sep 8, 2022
5f0ea01
Add a demo app field for setting # rows to fetch
latonv Sep 8, 2022
b5df1d4
Add aggregations_size param to requests and demo app
latonv Sep 8, 2022
ac6961b
Remove extra .ts extension on base-search-backend
latonv Sep 8, 2022
8b5421f
Move backend factory method onto SearchService
latonv Sep 8, 2022
58ac608
Rename SearchBackendOptions(Interface)
latonv Sep 8, 2022
28d8938
Prevent empty page_type and page_target
latonv Sep 8, 2022
90ec583
Better document search parameters
latonv Sep 9, 2022
c75abdb
Clarify doc wording around search URL generation
latonv Sep 9, 2022
1f64905
Remove empty search service constructor
latonv Sep 9, 2022
8b0df26
Move backend factory tests over to search service
latonv Sep 9, 2022
5bfa4b8
Add more documentation & clarifications
latonv Sep 9, 2022
2c5b813
Remove unavailable search types from the enum
latonv Sep 9, 2022
e1c07fe
Rename some hit-related types/properties for clarity
latonv Sep 9, 2022
0239e57
Simplify result type definition
latonv Sep 9, 2022
e3a4726
Clarify test descriptions
latonv Sep 9, 2022
9ca9291
Ensure error is not thrown for invalid hit types
latonv Sep 9, 2022
6b2ff1d
Improve search param docs
latonv Sep 9, 2022
49497ae
Add more documentation around search params & schemas
latonv Sep 9, 2022
73b4fb1
Improve documentation for search service interface
latonv Sep 9, 2022
88f2e60
Improve documentation for text/item hits
latonv Sep 9, 2022
5825bea
Better document aggregations_size param
latonv Sep 9, 2022
b2c3716
Format doc comments more prettily
latonv Sep 9, 2022
7e7530a
Add test for aggregations_size
latonv Sep 9, 2022
2d13b48
Move result factory method to SearchResponseDetails
latonv Sep 12, 2022
cda64d4
Refer to Metadata types in hit models
latonv Sep 12, 2022
a0451d8
Revert rawMetadata to explicit type
latonv Sep 13, 2022
08e1421
Clean up type imports
latonv Sep 13, 2022
5f56b1c
Add a breakdown of the search params to the README
latonv Sep 13, 2022
555d087
Normalize some search response properties to camel case
latonv Sep 13, 2022
b0d3102
Add README sections documenting search params and responses
latonv Sep 13, 2022
0733976
Adjust README for clarity
latonv Sep 16, 2022
8428bfb
Rename Result to SearchResult to avoid overlap with Result<T, E>
latonv Sep 16, 2022
361bf81
Formatting
latonv Sep 16, 2022
721d533
Add sorting method for aggregations
latonv Sep 19, 2022
2f06f98
Make aggregations immutable
latonv Sep 19, 2022
6d05537
Correctly construct SearchRequest object
latonv Sep 19, 2022
b9d6f5c
Specify aggregation options type
latonv Sep 20, 2022
e58734e
Better document aggregation sort options
latonv Sep 20, 2022
27199fa
Remove trailing whitespace
latonv Sep 20, 2022
583c7cb
Log debugging info if it is present on the response
latonv Sep 21, 2022
9c01a00
Add debugging checkbox to demo app
latonv Sep 21, 2022
546f47a
Move demo app queries into private fields
latonv Sep 21, 2022
ce63e7a
Make debug info logs easier to navigate
latonv Sep 21, 2022
18a98e2
Document logging method and avoid errors on missing fields
latonv Sep 21, 2022
4eb8f8e
Add unit tests for debugging output
latonv Sep 21, 2022
0d79504
Export AggregationSortType
latonv Sep 21, 2022
3c152e5
Add unit test for backend options
latonv Sep 22, 2022
c0f0bc0
Export search backend options interface
latonv Sep 22, 2022
9ba40db
Add unit tests for service path param
latonv Sep 22, 2022
d41c50d
Update lit to modern version (for demo app)
latonv Sep 22, 2022
e6e1e83
Better organized backend tests
latonv Sep 22, 2022
e500ab9
v0.4.0
latonv Sep 22, 2022
121 changes: 95 additions & 26 deletions README.md
@@ -1,6 +1,6 @@
# Internet Archive Search Service

A service for searching and retrieving metadata from the Internet Archive.
A service for searching the Internet Archive.

## Installation
```bash
@@ -13,6 +13,7 @@ npm install @internetarchive/search-service
```ts
import {
SearchService,
SearchType,
SortParam,
SortDirection
} from '@internetarchive/search-service';
@@ -23,50 +24,118 @@ const params = {
query: 'collection:books AND title:(goody)',
sort: [dateSort],
rows: 25,
start: 0,
fields: ['identifier', 'collection', 'title', 'creator']
};

const result = await searchService.performSearch(params);
const result = await searchService.search(params, SearchType.METADATA);
if (result.success) {
const searchResponse = result.success;
searchResponse.response.numFound // => number
searchResponse.response.docs // => Metadata[] array
searchResponse.response.docs[0].identifier // => 'identifier-foo'
searchResponse.response.totalResults // => number -- total number of search results available to fetch
searchResponse.response.returnedCount // => number -- how many search results are included in this response
searchResponse.response.results // => Result[] array
searchResponse.response.results[0].identifier // => 'some-item-identifier'
searchResponse.response.results[0].title?.value // => 'some-item-title', or possibly undefined if no title exists on the item
}
```

### Fetch Metadata
Currently available search types are `SearchType.METADATA` and `SearchType.FULLTEXT`.

```ts
const metadataResponse: MetadataResponse = await searchService.fetchMetadata('some-identifier');
### Search parameters

metadataResponse.metadata.identifier // => 'some-identifier'
metadataResponse.metadata.collection.value // => 'some-collection'
metadataResponse.metadata.collection.values // => ['some-collection', 'another-collection', 'more-collections']
```
The `params` object passed as first argument to search calls can have the following properties:

## Metadata Values
#### `query`
The full search query, which may include Lucene syntax.

Internet Archive Metadata is expansive and nearly all metadata fields can be returned as either an array, string, or number.
#### `rows`
The maximum number of search results to be retrieved per page.

The Search Service handles all of the possible variations in data formats and converts them to their appropriate types. For instance on date fields, like `date`, it takes the string returned and converts it into a native javascript `Date` value. Similarly for duration-type fields, like `length`, it takes the duration, which can be seconds `324.34` or `hh:mm:ss.ms` and converts them to a `number` in seconds.
#### `page`
Which page of results to retrieve, beginning from page 1.
Each page is sized according to the `rows` parameter, so requesting `{ rows: 20, page: 3 }`
would retrieve results 41-60, etc.
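
The relationship between `rows` and `page` can be sketched as below. This is only an illustration of the arithmetic described above; `pageRange` and `PageRange` are hypothetical names and are not part of the package.

```ts
// Hypothetical helper (not part of @internetarchive/search-service):
// computes the 1-based, inclusive range of result positions on a page.
interface PageRange {
  first: number;
  last: number;
}

function pageRange(rows: number, page: number): PageRange {
  if (!Number.isInteger(rows) || rows < 1) {
    throw new RangeError('rows must be a positive integer');
  }
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be a positive integer (pages start at 1)');
  }
  // Pages are 1-indexed, so page 1 starts at result 1.
  const first = (page - 1) * rows + 1;
  const last = page * rows;
  return { first, last };
}

const range = pageRange(20, 3);
console.log(`${range.first}-${range.last}`); // prints "41-60"
```

The last page of a result set may of course contain fewer than `rows` results, so `last` is an upper bound rather than a guarantee.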

There are parsers for several different field types, like `Number`, `String`, `Date`, and `Duration` and others can be added for other field types.
#### `fields`
An array of metadata field names that should be present on the returned search results.

See `src/models/metadata-fields/field-types.ts`
#### `sort`
An array of sorting parameters to apply to the results.
The first array element specifies the primary sort, the second element the secondary sort, and so on.
Each sorting parameter has the form
```js
{ field: string, direction: 'asc' | 'desc' }
```
where `field` is the name of the column to sort on (e.g., `title`) and `direction` is whether to sort ascending or descending.
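
To illustrate what primary and secondary sorts mean, here is a minimal local sketch. The actual sorting is done by the search backend; `SortSpec`, `Row`, and `compareBySpecs` are hypothetical names used only for this example, and the sample rows are made up.

```ts
// Hypothetical types mirroring the { field, direction } shape described above.
type SortDirectionName = 'asc' | 'desc';
interface SortSpec {
  field: string;
  direction: SortDirectionName;
}
type Row = Record<string, string>;

// Compares two rows by each sort spec in turn: later specs are only
// consulted when all earlier specs consider the rows equal.
function compareBySpecs(a: Row, b: Row, specs: SortSpec[]): number {
  for (const { field, direction } of specs) {
    const av = a[field] ?? '';
    const bv = b[field] ?? '';
    if (av === bv) continue;
    const order = av < bv ? -1 : 1;
    return direction === 'asc' ? order : -order;
  }
  return 0;
}

// Made-up sample data for illustration only.
const rows: Row[] = [
  { identifier: 'b', title: 'Goody Two-Shoes', date: '1888' },
  { identifier: 'a', title: 'Goody Two-Shoes', date: '1766' },
  { identifier: 'c', title: 'A Apple Pie', date: '1886' },
];

const specs: SortSpec[] = [
  { field: 'title', direction: 'asc' }, // primary sort
  { field: 'date', direction: 'desc' }, // secondary sort, breaks title ties
];

const sorted = [...rows].sort((x, y) => compareBySpecs(x, y, specs));
console.log(sorted.map(r => r.identifier).join(',')); // prints "c,b,a"
```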

#### `aggregations`
An object specifying which aggregations to retrieve with the query.
To retrieve no aggregations at all, this object should be `{ omit: true }`.
To retrieve aggregations for one or more keys, this object should resemble
```js
{ simpleParams: ['subject', 'creator', /*...*/] }
```

### Usage
To specify the number of buckets for individual aggregation types, the object
should instead use the `advancedParams` property, resembling
```js
{ advancedParams: [{ field: 'subject', size: 2 }, { field: 'creator', size: 4 }, /*...*/] }
```

```ts
metadata.collection.value // return just the first item of the `values` array, ie. 'my-collection'
metadata.collection.values // returns all values of the array, ie. ['my-collection', 'other-collection']
metadata.collection.rawValue // return the rawValue. This is useful for inspecting the raw response received.
However, these advanced aggregation parameters are not currently supported by the backend and may be removed at
a later date.
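
The three shapes of the `aggregations` object can be modelled roughly as follows. This is a sketch under assumptions: `AggregationsParam` and `requestedAggregationFields` are hypothetical names rather than the package's own types, and the precedence between the properties (`omit` first, then `simpleParams`, then `advancedParams`) is assumed here for illustration, not taken from the backend.

```ts
// Hypothetical model of the `aggregations` parameter described above.
interface AggregationsParam {
  omit?: boolean;
  simpleParams?: string[];
  advancedParams?: { field: string; size: number }[];
}

// Returns the aggregation keys a given parameter object asks for.
// Precedence between the properties is an assumption of this sketch.
function requestedAggregationFields(aggs: AggregationsParam): string[] {
  if (aggs.omit) return [];
  if (aggs.simpleParams) return [...aggs.simpleParams];
  return (aggs.advancedParams ?? []).map(p => p.field);
}

console.log(requestedAggregationFields({ omit: true })); // no aggregations
console.log(requestedAggregationFields({ simpleParams: ['subject', 'creator'] }));
console.log(
  requestedAggregationFields({
    advancedParams: [
      { field: 'subject', size: 2 },
      { field: 'creator', size: 4 },
    ],
  })
);
```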

#### `aggregationsSize`
The number of buckets to be returned for all aggregation types.
This defaults to 6 (the number of facets displayed for each type in the search results sidebar),
but can be overridden using this parameter to retrieve more/fewer buckets as needed.

#### `pageType`
A string indicating what type of page this data is being requested for. The search backend may
use a different set of default parameters depending on the page type. This defaults to
`'search_results'`, and currently only supports `'search_results' | 'collection_details'`, with
more types to be added in the future.

#### `pageTarget`
Used in conjunction with `pageType: 'collection_details'` to specify the identifier of the collection
to retrieve results for.

### Search types

At present the only two types of search available are Metadata Search (`SearchType.METADATA`)
and Full Text Search (`SearchType.FULLTEXT`). This will eventually be extended to support other
types of search including TV captions and radio transcripts. Calls that do not specify a search
type will default to Metadata Search.

### Return values

Calls to `SearchService#search` will return a Promise that either resolves to a `SearchResponse`
object or rejects with a `SearchServiceError`.

`SearchResponse` objects are structured similarly to this example:

```js
{
rawResponse: {/*...*/}, // The raw JSON fetched from the server
request: {
clientParameters: {/*...*/}, // The original client parameters sent with the request
finalizedParameters: {/*...*/} // The finalized request parameters as determined by the backend
},
responseHeader: {/*...*/}, // The header containing info about the response success/failure and processing time
response: {
totalResults: 12345, // The total number of search results matching the query
returnedCount: 50, // The number of search results returned in this response
results: [/*...*/], // The array of search results
aggregations: {/*...*/}, // A record mapping aggregation names to Aggregation objects
schema: {/*...*/} // The data schema to which the returned search results conform
}
}
```
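
As a sketch of how `totalResults` and `returnedCount` might be used together with the `rows` and `page` request parameters, the hypothetical helper below estimates how many results are still left to fetch. `ResponseCounts` and `remainingResults` are illustrative names only and assume that every earlier page was full.

```ts
// Hypothetical minimal view of the `response` object shown above.
interface ResponseCounts {
  totalResults: number;
  returnedCount: number;
}

// Estimates how many results remain after the given page, assuming all
// earlier pages each contained exactly `rows` results.
function remainingResults(
  response: ResponseCounts,
  rows: number,
  page: number
): number {
  const fetchedSoFar = (page - 1) * rows + response.returnedCount;
  return Math.max(0, response.totalResults - fetchedSoFar);
}

const firstPage: ResponseCounts = { totalResults: 12345, returnedCount: 50 };
console.log(remainingResults(firstPage, 50, 1)); // prints 12295
```

A result of `0` suggests there is no further page to request.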

metadata.date.value // return the date as a javascript `Date` object
### Fetch Metadata

metadata.length.value // return the length (duration) of the item as a number of seconds, can be in the format "hh:mm:ss" or decimal seconds
```
As of v0.4.0, metadata fetching has been moved to the
[iaux-metadata-service](https://github.com/internetarchive/iaux-metadata-service) package
and is no longer included as part of the Search Service.

# Development
