Commit

checkpoint
jeremyorr-hm committed Jul 20, 2023
1 parent 10e4044 commit 6d2a7ca
Showing 3 changed files with 31 additions and 5 deletions.
6 changes: 4 additions & 2 deletions docs/guide/user-guide/01-register.md
@@ -1,11 +1,13 @@
# Register
From the main landing page select register in the top right

![Initial Page](../img/starting-point.png)

## Sign Up
The registration process requires a name, an email address and then you need to set a password.
The registration process requires a name, an email address and then you need to choose a password.
> N.B. Your password must be at least 8 characters long and contain a mix of upper and lower case letters, numbers and other characters. You will also need to agree to the terms of service.
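
As a rough illustration of the password rule above, the sketch below checks a candidate password locally before you submit the form. The function name `meets_password_rule` is hypothetical, and treating "other characters" as anything that is not a letter or digit is an assumption; the service's own validation may differ.

```python
def meets_password_rule(password: str) -> bool:
    """Return True if the password satisfies the rule as described in this guide."""
    return (
        len(password) >= 8                           # at least 8 characters
        and any(c.islower() for c in password)       # a lower case letter
        and any(c.isupper() for c in password)       # an upper case letter
        and any(c.isdigit() for c in password)       # a number
        and any(not c.isalnum() for c in password)   # assumed meaning of "other characters"
    )
```
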
![Register](../img/registration.png)
Once you have successfully registered, you will receive a verification code at the email address you registered with. It will be sent from admin@utterworks.com. Sign in with the credentials you have just created, then enter the verification code as prompted.
![Verification](../img/verification.png)
Congratulations, you are now signed in and ready to create your first search App :smile:
Congratulations, you are now signed in and ready to create your first search project :smile:
4 changes: 2 additions & 2 deletions docs/guide/user-guide/02-create-app.md
@@ -1,5 +1,5 @@
# Create a Search Project
From the landing page, select Indexes from the menu bar top right to get started
From the landing page as a signed in user, select Indexes from the menu bar top right to get started

![Landing Page](../img/landing-page.png)

@@ -10,7 +10,7 @@ Create a new search project using the add new project button
Enter a project name.
>The name can be at most 10 characters long with no spaces; lowercase letters, numbers and hyphens are allowed. It will be the reference used for the app in the API.
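
The naming rule above can be sketched as a simple pattern check. This is a minimal sketch, not the service's own validation: `is_valid_project_name` is a hypothetical helper, and it assumes the name must be non-empty and that hyphens may appear in any position.

```python
import re

# Assumed reading of the rule: 1 to 10 characters, each a lowercase letter, digit or hyphen.
NAME_PATTERN = re.compile(r"[a-z0-9-]{1,10}")


def is_valid_project_name(name: str) -> bool:
    """Return True if the whole name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `my-docs-1` would pass this check, while `My Docs` (upper case and a space) and `averylongname` (13 characters) would not.
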
Next, provide a meaningful title, and then select the source for the content to be indexed. This can be either a website (to crawl or configure a sitemap), or an AWS S3 bucket.
Next, provide a meaningful description, and then select the source for the content to be indexed. This can be either a website (to crawl or configure a sitemap), or an AWS S3 bucket.

![New project dialog](../img/new-project-dialog.png)

26 changes: 25 additions & 1 deletion docs/guide/user-guide/03-configure-project.md
@@ -1,7 +1,31 @@
# Configure a project
In the projects view, clicking on the project name takes you to the project configuration.

## Index Configuration
The first step to setting up a new project is to configure the source of the content to be included in the search app - we call this the index. The index configuration is a view on the left hand project navigation view, or if the project is new and hasn't yet had an index configured there is a configure index button top right

The first step in setting up a new project is to configure the source of the content to be included in the search project - we call this the index. The index configuration is a view on the left hand project navigation control, or if the project is new and hasn't yet had an index configured there is a configure index button top right
![Project View](../img/new-project-home.png)

### URLs
#### Start URL
In a website search project, the start URL provides the root of the indexing job. This could be a specific URL, or a sitemap XML document that represents some or all of the content on the website. The domain (e.g. the example.com portion of the start URL) controls the main scope of the indexing job: only content in the same domain will be indexed. To capture content from several domains, create multiple URL configurations in the same project.
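
To make the domain scoping concrete, here is a minimal sketch of how a crawler might decide whether a discovered link is in scope. The helper `in_scope` is hypothetical, and it assumes "same domain" means an exact hostname match; the Find service may treat subdomains or ports differently.

```python
from urllib.parse import urlsplit


def in_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url has the same hostname as start_url."""
    start_host = urlsplit(start_url).hostname
    candidate_host = urlsplit(candidate_url).hostname
    # hostname is None for URLs without a network location, which are never in scope here
    return start_host is not None and start_host == candidate_host
```

Under this assumption, a link to `https://example.com/docs/page` is in scope for a start URL of `https://example.com/sitemap.xml`, while a link to `https://other.org/page` is not and would need its own URL configuration.
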

#### Allowed Path Patterns

#### Blocked Path Patterns

#### XPath

#### Wait XPath

#### Follow Links

### Automatic Document Classification
The Find service can automatically assign indexed content a classification from a list of classifications you provide. This uses a process called zero-shot classification and is unsupervised: as documents are indexed, a classifier determines which of the provided classifications best fits the content. The accuracy of this process improves when the classification labels are as meaningful and distinct as possible.

### Indexed Content Processing
An index can be configured to index content by page or, if the "split content into paragraphs" option is selected, at the paragraph level. Paragraph indexing increases the granularity of the search results, but can sometimes lose context that is inferable from the full page content.
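
The difference between the two modes can be sketched as follows. This is an illustration only: `split_into_paragraphs` is a hypothetical helper, and splitting on blank lines is an assumption about how paragraphs are detected, not a description of the service's implementation.

```python
import re


def split_into_paragraphs(page_text: str) -> list[str]:
    """Split page text on blank lines and drop empty fragments."""
    parts = re.split(r"\n\s*\n", page_text)
    return [part.strip() for part in parts if part.strip()]


page = "Pricing overview.\n\nThe basic plan is free.\n\n\nIt includes one project."
page_units = [page]                             # page-level indexing: one unit per page
paragraph_units = split_into_paragraphs(page)   # paragraph-level indexing: one unit per paragraph
```

Here the third paragraph, "It includes one project.", no longer carries the mention of the basic plan once it is indexed on its own, which is the kind of context loss described above.
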

### Meta Tags to extract
If your content is already categorised (perhaps by your CMS), adding the tags that contain this information in this section means those tags will be extracted, stored, and retrieved as context for the indexed content. This can be useful for faceted search.
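
As a sketch of what extracting configured meta tags could look like, the example below uses Python's standard `html.parser` module. The class `MetaTagExtractor`, the function `extract_meta` and the tag name `category` are hypothetical, and it assumes the tags take the form `<meta name="..." content="...">`; the service's actual extraction may work differently.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collect the content of <meta name="..."> tags whose name is in a configured list."""

    def __init__(self, wanted_names):
        super().__init__()
        self.wanted = {name.lower() for name in wanted_names}
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if name in self.wanted:
            self.found[name] = attr_map.get("content") or ""


def extract_meta(html: str, wanted_names) -> dict:
    """Return a mapping of configured meta tag names to their content values."""
    parser = MetaTagExtractor(wanted_names)
    parser.feed(html)
    return parser.found


html = '<head><meta name="category" content="billing"><meta name="author" content="x"></head>'
tags = extract_meta(html, ["category"])  # only the configured tag is kept
```

The resulting mapping could then be stored alongside the indexed content and used as a facet value.
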
