Merge pull request #115 from lenisha/master
Skills and samples updates
gmndrg committed Nov 14, 2023
2 parents 752ba2a + da0fe27 commit 1928aaa
Showing 45 changed files with 2,490 additions and 295 deletions.
71 changes: 71 additions & 0 deletions 01 - Search Index Creation/01.1 - BuiltIn Skills/README.md
@@ -0,0 +1,71 @@
# Adding a Built-in Skill to the Skillset

Add the Sentiment Analysis skill to the skillset and verify that sentiment values are generated and stored in the index.

Use https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-sentiment-v3 as a reference for the skill inputs and outputs.


- Add a `sentiment` field to the index
```json
{
    "name": "sentiment",
    "type": "Edm.String",
    "searchable": true,
    "sortable": true,
    "filterable": true,
    "facetable": true
}
```

- Add `#Microsoft.Skills.Text.V3.SentimentSkill` to the skillset
```json
{
    "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
    "name": "sentiment",
    "description": "",
    "context": "/document",
    "defaultLanguageCode": "en",
    "modelVersion": "",
    "includeOpinionMining": true,
    "inputs": [
        {
            "name": "text",
            "source": "/document/merged_text"
        }
    ],
    "outputs": [
        {
            "name": "sentiment",
            "targetName": "sentiment"
        },
        {
            "name": "confidenceScores",
            "targetName": "confidenceScores"
        },
        {
            "name": "sentences",
            "targetName": "sentences"
        }
    ]
}
```

- Update the indexer to add an output field mapping between the skill output and the index field

```json
{
    "sourceFieldName": "/document/sentiment",
    "targetFieldName": "sentiment"
}
```
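
The mapping above has to end up in the indexer's `outputFieldMappings` array. As a minimal sketch, assuming the indexer definition is held as a plain Python dictionary before it is sent to the service, the helper below adds the mapping without creating duplicates. The function name and the `demo-indexer` definition are hypothetical and not part of this repository.

```python
import copy


def add_output_field_mapping(indexer: dict, source: str, target: str) -> dict:
    """Return a copy of an indexer definition with one output field mapping added.

    A mapping that already points at the same target field is replaced,
    so calling this twice does not create duplicates.
    """
    updated = copy.deepcopy(indexer)
    mappings = [
        m for m in updated.get("outputFieldMappings", [])
        if m.get("targetFieldName") != target
    ]
    mappings.append({"sourceFieldName": source, "targetFieldName": target})
    updated["outputFieldMappings"] = mappings
    return updated


# Hypothetical indexer definition; only the fields relevant here are shown.
indexer = {"name": "demo-indexer", "outputFieldMappings": []}
indexer = add_output_field_mapping(indexer, "/document/sentiment", "sentiment")
```

The updated dictionary can then be used as the body of the indexer update request in whichever client you prefer.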

**Refer** to the Postman collection for more details.


# Verify Index Data

- Search for all documents that contain the word `GitHub`, sorted by sentiment

- Search all documents and show the sentiment and locations facets

- Search for documents that have a location in Europe
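
The three checks above can be sketched as request bodies for the search REST endpoint (`POST /indexes/[index name]/docs/search`). This is an illustration only: it assumes the index has a `locations` field of type `Collection(Edm.String)` that is filterable and facetable, which is not shown in this README, and the dictionary keys used to label the queries are made up for the example.

```python
import json


def build_verification_queries() -> dict:
    """Build one search request body per verification step."""
    return {
        # Documents containing "GitHub", ordered by the sentiment label.
        "github_sorted_by_sentiment": {
            "search": "GitHub",
            "orderby": "sentiment desc",
            "count": True,
        },
        # All documents, with facet counts for sentiment and locations.
        "all_with_facets": {
            "search": "*",
            "facets": ["sentiment", "locations"],
            "count": True,
        },
        # Documents whose locations collection contains "Europe"
        # (assumes `locations` is a filterable string collection).
        "located_in_europe": {
            "search": "*",
            "filter": "locations/any(l: l eq 'Europe')",
            "count": True,
        },
    }


if __name__ == "__main__":
    for name, body in build_verification_queries().items():
        print(name, json.dumps(body))
```

Because `sentiment` is stored as `Edm.String`, `orderby` sorts the labels alphabetically rather than by score.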


@@ -31,6 +31,11 @@
"key": "cog_services_key",
"value": "",
"enabled": true
},
{
"key": "env_function_url",
"value": "",
"enabled": true
}
],
"_postman_variable_scope": "environment",
5 changes: 5 additions & 0 deletions 01 - Search Index Creation/Create-Index-Postman.md
@@ -39,6 +39,11 @@ We recommend using this collection to create an initial index and then iterating

You can then *check the indexer status* to see if documents are processing or if there are any errors. If the indexer does not start running automatically, you can run the indexer manually.
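
For readers who prefer a script to Postman, the status check and manual run described above map to two REST calls. The sketch below only builds the requests and does not send them. The service name, indexer name, key, and the `2020-06-30` API version are placeholders chosen for the example, so adjust them to match your environment.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed API version; use the one your Postman collection targets.
API_VERSION = "2020-06-30"


def indexer_request(service: str, indexer: str, api_key: str,
                    action: str = "status") -> Request:
    """Build (but do not send) a request to check or run an indexer.

    action="status" -> GET  /indexers/{name}/status
    action="run"    -> POST /indexers/{name}/run
    """
    if action not in ("status", "run"):
        raise ValueError("action must be 'status' or 'run'")
    method = "GET" if action == "status" else "POST"
    url = (
        f"https://{service}.search.windows.net/indexers/"
        f"{quote(indexer, safe='')}/{action}?api-version={API_VERSION}"
    )
    # The run call carries an empty body; the status call has none.
    data = b"" if method == "POST" else None
    return Request(url, data=data, headers={"api-key": api_key}, method=method)


# Hypothetical names; pass the result to urllib.request.urlopen to execute.
status_request = indexer_request("my-search-service", "my-indexer", "<admin-key>")
```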

## Verify Index

Use Search explorer or Postman to search the data.


## Additional Resources

For more help working with Postman, see the [documentation](https://learning.postman.com/docs/getting-started/introduction/) on the Postman website.
4 changes: 3 additions & 1 deletion 01 - Search Index Creation/README.md
@@ -21,4 +21,6 @@ This folder includes three options for creating an index. Each of these approach

1. [Create a search index using the Azure Portal](./Create-Index-AzurePortal.md)
2. [Create a search index using PowerShell](./Create-Index-PowerShell.md)
3. [Create a search index using Postman](./Create-Index-Postman.md)

4. Optionally, go through the Sentiment Analysis setup example in [01.1 - BuiltIn Skills](./01.1%20-%20BuiltIn%20Skills/)
7 changes: 5 additions & 2 deletions 02 - Web UI Template/README.md
@@ -86,9 +86,12 @@ docker run -d --env-file .env -p 80:80 kmworkshop.azurecr.io/web-ui:latest

1. Visual Studio 2019 or newer - [Download](https://visualstudio.microsoft.com/downloads/)

## 1. Update appsettings.json
## 1. Update appsettings configuration

To configure your web app to connect to your Azure services, simply update the *appsettings.json* file.
To configure your web app to connect to your Azure services, update the *appsettings.json* file and rebuild the container.

Alternatively, update the web app configuration:
![](../images/appsettings.png)

This file contains a mix of required and optional fields described below.

103 changes: 103 additions & 0 deletions 03 - Data Science and Custom Skills/FormRecognizer Skill/README.md
@@ -0,0 +1,103 @@

# Form Recognizer Custom Skill

Follow the Microsoft Learn module [Build a Form Recognizer custom skill for Azure Cognitive Search](https://learn.microsoft.com/en-us/training/modules/build-form-recognizer-custom-skill-for-azure-cognitive-search/4-exercise-build-deploy)
to create a Form Recognizer service and deploy the Azure Function using Cloud Shell.

Then integrate the Form Recognizer prebuilt invoice model into the Cognitive Search pipeline.

# AnalyzeInvoice

This custom skill extracts invoice-specific fields using a pretrained Form Recognizer model.


## Settings

This Azure Function requires access to an [Azure Form Recognizer](https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/) resource. The [prebuilt invoice model](https://docs.microsoft.com/azure/cognitive-services/form-recognizer/concept-invoices) is available in the 2.1 preview API.


This function requires two settings: `FORMS_RECOGNIZER_ENDPOINT`, set to your Form Recognizer 2.1-preview endpoint, and `FORMS_RECOGNIZER_KEY`, set to a valid Form Recognizer API key.
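
As a minimal sketch of how a Python function might read these two settings (Azure Functions exposes application settings as environment variables), the helper below fails fast when either one is missing. The helper name is hypothetical and this is not necessarily how the deployed function does it.

```python
import os
from typing import Mapping, Optional, Tuple

REQUIRED_SETTINGS = ("FORMS_RECOGNIZER_ENDPOINT", "FORMS_RECOGNIZER_KEY")


def load_form_recognizer_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (endpoint, key), raising if either setting is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    # Strip a trailing slash so paths can be appended consistently.
    endpoint = env["FORMS_RECOGNIZER_ENDPOINT"].rstrip("/")
    return endpoint, env["FORMS_RECOGNIZER_KEY"]
```

Passing a dictionary instead of relying on `os.environ` keeps the helper easy to test locally, for example with values copied from `local.settings.json`.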



## Sample Input:

This sample data points to a file stored in the azure-search-power-skills repository, but when the skill is integrated into a skillset, the URL and token are provided by Cognitive Search.

```json
{
    "values": [
        {
            "recordId": "record1",
            "data": {
                "formUrl": "https://github.com/Azure-Samples/azure-search-power-skills/raw/master/SampleData/Invoice_4.pdf",
                "formSasToken": "?st=sasTokenThatWillBeGeneratedByCognitiveSearch"
            }
        }
    ]
}
```

## Sample Output:

```json
{
    "values": [
        {
            "recordId": "record1",
            "data": {
                "invoices": [
                    {
                        "AmountDue": 63.0,
                        "BillingAddress": "345 North St NY 98052",
                        "BillingAddressRecipient": "Fabrikam, Inc.",
                        "DueDate": "2018-05-31",
                        "InvoiceDate": "2018-05-15",
                        "InvoiceId": "1785443",
                        "InvoiceTotal": 56.28,
                        "VendorAddress": "4567 Main St Buffalo NY 90852",
                        "SubTotal": 49.3,
                        "TotalTax": 0.99
                    }
                ]
            }
        }
    ]
}
```
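
The input and output above follow the custom skill contract: the request carries a `values` array, and the response returns one entry per input record with the same `recordId`. The sketch below shows that envelope handling only. The `analyze` callable stands in for the actual Form Recognizer call and is hypothetical, and joining `formUrl` and `formSasToken` into one URL is an assumption about how the function consumes them.

```python
from typing import Callable, List


def run_skill(body: dict, analyze: Callable[[str], List[dict]]) -> dict:
    """Apply `analyze` to every record and build the custom skill response."""
    results = []
    for record in body.get("values", []):
        record_id = record.get("recordId")
        data = record.get("data", {})
        try:
            # Assumption: the SAS token is appended to the blob URL.
            form_url = data["formUrl"] + data.get("formSasToken", "")
            results.append(
                {"recordId": record_id, "data": {"invoices": analyze(form_url)}}
            )
        except Exception as exc:
            # A failed record is reported per record, not as a failed batch.
            results.append(
                {"recordId": record_id, "data": {}, "errors": [{"message": str(exc)}]}
            )
    return {"values": results}


def fake_analyze(url: str) -> List[dict]:
    """Stand-in for the Form Recognizer call, returning fixed data."""
    return [{"InvoiceId": "1785443"}]


request_body = {
    "values": [
        {
            "recordId": "record1",
            "data": {"formUrl": "https://example.com/invoice.pdf", "formSasToken": "?st=token"},
        }
    ]
}
response = run_skill(request_body, fake_analyze)
```

Reporting errors per record lets the indexer continue with the remaining documents in a batch.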

## Sample Skillset Integration

To use this skill in a Cognitive Search pipeline, add a skill definition to your skillset.
Here is a sample skill definition for this example (update the inputs and outputs to reflect your particular scenario and skillset environment):

```json
{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "name": "formrecognizer",
    "description": "Extracts fields from a form using a pre-trained form recognition model",
    "uri": "[AzureFunctionEndpointUrl]/api/AnalyzeInvoice?code=[AzureFunctionDefaultHostKey]",
    "httpMethod": "POST",
    "timeout": "PT1M",
    "context": "/document",
    "batchSize": 1,
    "inputs": [
        {
            "name": "formUrl",
            "source": "/document/metadata_storage_path"
        },
        {
            "name": "formSasToken",
            "source": "/document/metadata_storage_sas_token"
        }
    ],
    "outputs": [
        {
            "name": "invoices",
            "targetName": "invoices"
        }
    ]
}
```

Refer to the Postman collection for more details.
Binary files not shown (5 files).
@@ -0,0 +1,5 @@
.git*
.vscode
local.settings.json
test
.venv
@@ -0,0 +1,130 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don’t work, or not
# install all needed dependencies.
#Pipfile.lock

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Azure Functions artifacts
bin
obj
appsettings.json
local.settings.json
.python_packages
@@ -0,0 +1,6 @@
{
"recommendations": [
"ms-azuretools.vscode-azurefunctions",
"ms-python.python"
]
}
@@ -0,0 +1,13 @@
{
"version": "0.2.0",
"configurations": [

{
"name": "Attach to Python Functions",
"type": "python",
"request": "attach",
"port": 9091,
"preLaunchTask": "func: host start"
}
]
}
@@ -0,0 +1,8 @@
{
"azureFunctions.deploySubpath": ".",
"azureFunctions.scmDoBuildDuringDeployment": true,
"azureFunctions.pythonVenv": ".venv",
"azureFunctions.projectLanguage": "Python",
"azureFunctions.projectRuntime": "~2",
"debug.internalConsoleOptions": "neverOpen"
}
