@@ -42,10 +42,7 @@ This layout provider also indexes the artifacts using the Lucene-based [Maven In

## Maven 2 Search Providers

Please, be aware that the Maven 2 layout provider (unlike most of the other layout providers) supports two search providers:

* [OrientDB (default)](../search-providers.md#orientdbsearchprovider)
* [Maven Indexer](../search-providers.md#mavenindexersearchprovider) (search provider)
The Maven 2 layout provider uses the default [OrientDB](../search-providers#orientdbsearchprovider) search provider.

## Classes of Interest

116 changes: 85 additions & 31 deletions docs/developer-guide/maven-indexer.md
@@ -43,65 +43,114 @@ except a record for the POM.
The Maven Indexer may or may not store a `.pom` file as an artifact. It first tries to find a matching _real_
artifact file in the file system and, if one exists, indexes that instead of the `.pom` file.

### What's not indexable
### What's not indexable?

The following file types are not indexable:

* `maven-metadata.xml` files
* `.properties` files
* checksum and signature files `.asc`, `.md5`, `.sha1`
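
The exclusion list above can be sketched as a simple file-name filter. This is only an illustration of the rule, not the code Strongbox or the Maven Indexer actually uses; the class name `IndexableFiles` and the method `isIndexable` are hypothetical.

```java
import java.util.List;
import java.util.Locale;

public class IndexableFiles
{

    // Suffixes listed above as not indexable (properties, signatures, checksums).
    private static final List<String> EXCLUDED_SUFFIXES = List.of(".properties", ".asc", ".md5", ".sha1");

    // Hypothetical helper: returns false for the file types listed above.
    public static boolean isIndexable(String path)
    {
        // Only the file name matters, not the directories leading to it.
        String fileName = path.substring(path.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);

        if (fileName.equals("maven-metadata.xml"))
        {
            return false;
        }

        return EXCLUDED_SUFFIXES.stream().noneMatch(fileName::endsWith);
    }

    public static void main(String[] args)
    {
        System.out.println(isIndexable("org/foo/bar/1.0/bar-1.0.jar"));      // true
        System.out.println(isIndexable("org/foo/bar/1.0/bar-1.0.jar.sha1")); // false
        System.out.println(isIndexable("org/foo/bar/maven-metadata.xml"));   // false
    }
}
```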

### What Are Packed Indexes?

Packed indexes are either a complete compressed index, or a compressed subset of data which can be applied to an
existing index incrementally.

## What's the goal of packed indexes?

Packed indexes are used for transferring indexes from the remote to the proxy/tool.

## What Is The Maven Indexer Used For In The Strongbox Project?

The Maven Indexer is used for integration with IDEs.

The Maven indexes produced by most public repository managers (such as Maven Central), are usually rebuilt once a week,
as it can take quite a while to scan large repositories with countless small artifacts. Hence, these indexes have proven
to not be quite as up-to-date, as the real server's contents. For this reason, we are using OrientDB to keep more
accurate information.
## How Does The Maven Indexer Work In Strongbox?

Strongbox allows you to download the packed Maven Index of a repository. Every Maven repository with indexing enabled serves a packed Maven Index.
Depending on the repository type, the index is prepared as follows:

* [Hosted][hosted-repositories-link] repositories:

Strongbox stores information about the artifacts uploaded to hosted repositories in the OrientDB database. This information is used
to create the hosted repository Maven Index. Strongbox serves the following [Maven Indexer Fields][maven-indexer-fields-link] in the index:

* artifactId
* version
* classifier
* packaging/extension
* classnames
* lastModified
* size
* signatureExists
* sha1
* sourcesExists
* javadocExists

For each hosted Maven repository defined in Strongbox, there should be a scheduled task configured to rebuild the index (unless you
don't want to serve a Maven Index for that repository). The [Rebuild Maven Indexes Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/RebuildMavenIndexesCronJob.java)
is scheduled at Strongbox startup for the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
configuration file, for each `repository` with `type` equal to `hosted`.

Rebuilding the hosted repository Maven Index purges the previous index and recreates it from scratch, using the data stored in OrientDB.
As a result, the index stays as up to date as the actual contents of the server.

* [Proxy][proxy-repositories-link] repositories:

Strongbox fetches the proxy repository Maven Index from the remote host, stores it locally and serves it. For each proxy Maven repository
defined in Strongbox, there should be a scheduled task configured to re-fetch the index (unless you don't want to serve a Maven Index
for that repository). The [Download Remote Maven Index Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/DownloadRemoteMavenIndexCronJob.java)
is scheduled at Strongbox startup for the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
configuration file, for each `repository` with `type` equal to `proxy`.

Strongbox supports incremental updates of the proxy repository Maven Index: it updates the index by downloading only the Maven
Index parts that were not downloaded before, which saves bandwidth. Once these parts are downloaded,
they are merged with the locally existing index and finally packed.

The Maven indexes produced by most public repository managers (such as Maven Central) are usually rebuilt once a week.

* [Group][group-repositories-link] repositories:

Strongbox creates the group repository Maven Index by merging the Maven Indexes of its underlying repositories. This process is recursive, meaning that
the Maven Index of the root group repository contains all the information stored in the Maven Index of every repository beneath it, whether nested directly or through other groups.
The [Merge Maven Group Repository Index Cron Job](https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/cron/jobs/MergeMavenGroupRepositoryIndexCronJob.java)
is scheduled at Strongbox startup for the time specified by the `cronExpression` value within `repositoryConfiguration` in the [strongbox.yaml][strongbox-yaml-link]
configuration file, for each `repository` with `type` equal to `group`.

Rebuilding the group repository Maven Index purges the previous index and recreates it from scratch, so that it reflects the current state of the underlying repositories.
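
The recursive merge described for group repositories can be pictured with a small sketch. It models an index as a plain set of strings and a repository as a hypothetical `Repo` class; neither is a Strongbox or Maven Indexer type, and the real merge operates on Lucene indexes. The sketch also assumes that groups do not contain each other cyclically.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class GroupIndexMerge
{

    /** Hypothetical stand-in for a repository: its own index entries plus, for groups, its members. */
    public static final class Repo
    {
        private final Set<String> ownEntries;
        private final List<Repo> members;

        public Repo(Set<String> ownEntries, List<Repo> members)
        {
            this.ownEntries = ownEntries;
            this.members = members;
        }
    }

    // Collects the entries of a repository and, recursively, of every member below it.
    public static Set<String> mergedIndex(Repo repo)
    {
        Set<String> merged = new TreeSet<>(repo.ownEntries);
        for (Repo member : repo.members)
        {
            merged.addAll(mergedIndex(member));
        }
        return merged;
    }

    public static void main(String[] args)
    {
        Repo releases = new Repo(Set.of("org.foo:bar:1.0"), List.of());
        Repo snapshots = new Repo(Set.of("org.foo:bar:1.1-SNAPSHOT"), List.of());
        Repo innerGroup = new Repo(Set.of(), List.of(snapshots));
        Repo rootGroup = new Repo(Set.of(), List.of(releases, innerGroup));

        // prints [org.foo:bar:1.0, org.foo:bar:1.1-SNAPSHOT]
        System.out.println(mergedIndex(rootGroup));
    }
}
```

Because the result is rebuilt from the members on every call, it mirrors the "purge and recreate" behaviour rather than an incremental update.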

## Where Are The Maven Indexes Located in Strongbox?

There are two types of Maven Indexer indexes:

* Local
* For hosted repositories, this contains the artifacts that have been deployed to this repository.
* For proxy repositories, this contains the artifacts which have been requested and cached from the remote repository.
* For group repositories, this contains the merged index from the underlying repositories.
* Remote
* This is downloaded from the remote repository and contains a complete index of what is available on the remote.

## Where Are The Maven Indexes Located?

Every repository has an index under the `strongbox-vault/storages/${storageId}/${repositoryId}/.index` directory
Every repository with indexing enabled keeps its index under the `strongbox-vault/storages/${storageId}/${repositoryId}/.index` directory.

* [Hosted](../knowledge-base/repositories.md#hosted) repositories have:
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
* [Proxy](../knowledge-base/repositories.md#proxy) repositories have:
* [Hosted][hosted-repositories-link] repositories have:
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
* [Proxy][proxy-repositories-link] repositories have:
* Remote: `strongbox-vault/storages/${storageId}/${repositoryId}/remote/.index`
* [Group][group-repositories-link] repositories have:
* Local: `strongbox-vault/storages/${storageId}/${repositoryId}/local/.index`
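
As a rough illustration, the per-type locations listed above can be resolved like this. The class `IndexLocations` and the ids `storage0` and `releases` are made up for the example, and the layout follows the `local/.index` and `remote/.index` bullets rather than any Strongbox API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class IndexLocations
{

    public enum IndexType
    {
        LOCAL, REMOTE
    }

    // Builds <vault>/storages/<storageId>/<repositoryId>/<local|remote>/.index
    public static Path indexDirectory(Path vault, String storageId, String repositoryId, IndexType type)
    {
        return vault.resolve("storages")
                    .resolve(storageId)
                    .resolve(repositoryId)
                    .resolve(type.name().toLowerCase(Locale.ROOT))
                    .resolve(".index");
    }

    public static void main(String[] args)
    {
        // "storage0" and "releases" are hypothetical ids used only for this example.
        System.out.println(indexDirectory(Paths.get("strongbox-vault"), "storage0", "releases", IndexType.LOCAL));
    }
}
```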

## Do Maven Indexes Break And How To Repair Them?

Usually, you don't need to rebuild the index, because all artifact operations should be handled via the REST API.

However, a rebuild can be necessary, for example when:
- Some artifacts have gone missing (due to a disk error, or because somebody removed them) and you need to restore one, or a whole batch of
them, manually on the file system without using the REST API
- You have added or removed some artifact(s) manually on the file system and would like to update the index

## Packed Indexes
## How to force a rebuild of the repository index?

In contrast to unpacked indexes (which are used for searching and browsing the remote), packed indexes are used for
transferring indexes from the remote to the proxy/tool.
Use the REST API endpoint:

### What Are Packed Indexes?
* `POST` `/api/maven/index/{storageId}/{repositoryId}`
* see also [MavenIndexController](https://github.com/strongbox/strongbox/blob/master/strongbox-web-core/src/main/java/org/carlspring/strongbox/controllers/layout/maven/MavenIndexController.java)
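
A minimal sketch of calling this endpoint with the JDK's built-in HTTP client (Java 11+) might look as follows. The base URL `http://localhost:48080` and the ids `storage0` and `releases` are assumptions for the example, and authentication is left out entirely; check `MavenIndexController` for the headers, credentials and response the endpoint really expects.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RebuildIndexClient
{

    // Builds POST <baseUrl>/api/maven/index/{storageId}/{repositoryId} with an empty body.
    public static HttpRequest rebuildRequest(String baseUrl, String storageId, String repositoryId)
    {
        URI uri = URI.create(baseUrl + "/api/maven/index/" + storageId + "/" + repositoryId);

        return HttpRequest.newBuilder(uri)
                          .POST(HttpRequest.BodyPublishers.noBody())
                          .build();
    }

    public static void main(String[] args) throws Exception
    {
        // Assumed host, port and ids; adjust them to your installation.
        HttpRequest request = rebuildRequest("http://localhost:48080", "storage0", "releases");

        HttpResponse<String> response = HttpClient.newHttpClient()
                                                  .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("HTTP status: " + response.statusCode());
    }
}
```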

Packed indexes are either a complete compressed index, or a compressed subset of data which can be applied to an
existing index incrementally.
## How to download the packed repository index?

### When Are Packed Indexes Generated?
Use the REST API endpoint:

Packed indexes are generated when the index for a repository is rebuilt. They are not generated when a re-indexing
request for a path in the repository is executed.
* `GET` `/storages/{storageId}/{repositoryId}/.index/nexus-maven-repository-index.gz`
* see also [MavenArtifactController](https://github.com/strongbox/strongbox/blob/master/strongbox-web-core/src/main/java/org/carlspring/strongbox/controllers/layout/maven/MavenArtifactController.java)
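
Downloading the packed index could be sketched like this, again with the JDK HTTP client (Java 11+). The class `PackedIndexDownloader`, the base URL and the ids in the usage note are hypothetical, and no authentication is sent.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class PackedIndexDownloader
{

    // Builds <baseUrl>/storages/{storageId}/{repositoryId}/.index/nexus-maven-repository-index.gz
    public static URI packedIndexUri(String baseUrl, String storageId, String repositoryId)
    {
        return URI.create(baseUrl + "/storages/" + storageId + "/" + repositoryId +
                          "/.index/nexus-maven-repository-index.gz");
    }

    // Streams the response body into the target file and fails on any non-200 status.
    public static Path download(String baseUrl, String storageId, String repositoryId, Path target)
            throws IOException, InterruptedException
    {
        HttpRequest request = HttpRequest.newBuilder(packedIndexUri(baseUrl, storageId, repositoryId))
                                         .GET()
                                         .build();

        HttpResponse<Path> response = HttpClient.newHttpClient()
                                                .send(request, HttpResponse.BodyHandlers.ofFile(target));

        if (response.statusCode() != 200)
        {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }

        return response.body();
    }
}
```

A call such as `PackedIndexDownloader.download("http://localhost:48080", "storage0", "releases", Path.of("nexus-maven-repository-index.gz"))` would then store the index in the working directory. Note that the file handler writes whatever body the server returns, so on an error status the target file may contain an error page and should be discarded.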

## Information For Developers

@@ -110,7 +159,7 @@ The code for the Maven indexing is located under the [strongbox-storage-maven-la
## See Also
* [Maven Indexer: Github](https://github.com/apache/maven-indexer/)
* [Maven Indexer: About](http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/index.html)
* [Maven Indexer: Fields](http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html)
* [Maven Indexer: Fields][maven-indexer-fields-link]
* [Maven Indexer: Core (Notes)](https://github.com/apache/maven-indexer/tree/master/indexer-core)
* [Maven Indexer: Examples](https://github.com/apache/maven-indexer/tree/master/indexer-examples)
* [Maven Indexer: Incremental Downloading](http://blog.sonatype.com/2009/05/nexus-indexer-20-incremental-downloading/)
@@ -121,4 +170,9 @@ The code for the Maven indexing is located under the [strongbox-storage-maven-la
* [Stackoverflow: [maven-indexer]](http://stackoverflow.com/questions/tagged/maven-indexer)


[strongbox-yaml-link]: https://github.com/strongbox/strongbox/blob/master/strongbox-resources/strongbox-storage-api-resources/src/main/resources/etc/conf/strongbox.yaml
[maven-indexer-fields-link]: http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html
[hosted-repositories-link]: ../knowledge-base/repositories.md#hosted
[proxy-repositories-link]: ../knowledge-base/repositories.md#proxy
[group-repositories-link]: ../knowledge-base/repositories.md#group
[strongbox-storage-maven-layout-provider]: https://github.com/strongbox/strongbox/tree/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout/strongbox-storage-maven-layout-provider
63 changes: 4 additions & 59 deletions docs/developer-guide/search-providers.md
@@ -3,66 +3,16 @@
## Introduction

Search providers offer a way to execute searches against different search engines. By default, searches are executed
against OrientDB, unless a search provider has been specified.

The idea behind search providers is that certain layout providers could require their own search engine, as is the
case with Maven. Information about Maven artifacts is stored both in OrientDB and in the [Maven Indexer].
As the [Maven Indexer] can actually be consumed by various clients and tools (such as other repository managers,
IDE-s and so on), we provided a way to further extend the searches for both existing and future layout providers which
might also need to have their own search engine implementations, apart from the built-in one (OrientDB).

against OrientDB.

## Implemented Search Providers

### OrientDbSearchProvider

The [OrientDbSearchProvider] is the default search provider which uses OrientDB.

### MavenIndexerSearchProvider

The [MavenIndexerSearchProvider] is the search provider for Maven artifacts when the [Maven Indexer] Lucene indexes
should be queried.

## Implementing a Search Provider

Custom search providers should implement [SearchProvider] and register with [SearchProviderRegistry].
Currently, [OrientDbSearchProvider] is the only supported search provider; it uses OrientDB.

## Executing A Search Programmatically

### MavenIndexerSearchProvider Example

```java
@Inject
private ArtifactIndexesService artifactIndexesService;

// Run a search against the index and get a list of
// all the artifacts matching this exact GAV
SearchRequest request = new SearchRequest(storageId,
repositoryId,
"+g:" + groupId + " " +
"+a:" + artifactId + " " +
"+v:" + version,
MavenIndexerSearchProvider.ALIAS);

try
{
SearchResults results = artifactSearchService.search(request);

for (SearchResult result : results.getResults())
{
String artifactPath = result.getArtifactCoordinates().toPath();

logger.debug("Artifact path " + artifactPath);

// Do something else here that is more meaningful
}
}
catch (SearchException e)
{
logger.error(e.getMessage(), e);
}
```

### OrientDbSearchProvider Example

```java
@@ -76,8 +26,7 @@ String query = "groupId=org.carlspring.strongbox.searches;" +

SearchRequest request = new SearchRequest(storageId,
repositoryId,
query,
OrientDbSearchProvider.ALIAS);
query);

try
{
@@ -99,13 +48,9 @@ catch (SearchException e)
```

## See Also
* [Maven Indexer]
* [REST-API]


[SearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/SearchProvider.java
[SearchProviderRegistry]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/SearchProviderRegistry.java
[OrientDbSearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-api/src/main/java/org/carlspring/strongbox/providers/search/OrientDbSearchProvider.java
[MavenIndexerSearchProvider]: https://github.com/strongbox/strongbox/blob/master/strongbox-storage/strongbox-storage-layout-providers/strongbox-storage-maven-layout-provider/src/main/java/org/carlspring/strongbox/providers/search/MavenIndexerSearchProvider.java
[REST-API]: ../user-guide/rest-api.md
[Maven Indexer]: ./maven-indexer.md
[REST-API]: ../user-guide/rest-api.md