CluedIn-io/CluedIn.Enricher.Web

CluedIn.Enricher.Web

CluedIn External Search for Website Crawler.


Overview

This repository contains the code and associated tests for crawling a public website based on Entities and Clues that have a value set for the Organization.Website core vocabulary.

Usage

NuGet Packages

To use the Web External Search with the CluedIn server, add the CluedIn.Enricher.Web NuGet package to your environment.

Running Tests

A mocked environment is required to run integration and acceptance tests. The mocked environment can be built and run using the following Docker command:

docker-compose up --build -d

Use the following commands to run all Unit and Integration tests within the repository:

dotnet test .\ExternalSearch.Web.sln --filter Unit
dotnet test .\ExternalSearch.Web.sln --filter Integration

To run the Pester acceptance tests:

Invoke-Pester

To review the WireMock HTTP proxy logs:

docker-compose logs wiremock
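
The steps above can be chained into a single helper. The following is a minimal sketch for a bash shell, not part of this repository: the function names `run` and `run_all_tests`, the `DRY_RUN` and `SOLUTION` variables, and the use of `pwsh` to launch Pester are all assumptions made for illustration. It also assumes Docker, the .NET SDK, and PowerShell with the Pester module are installed.

```shell
# Hypothetical helper (not shipped with this repository).
# Path to the solution file; forward slash because this sketch targets bash,
# whereas the commands above use the Windows-style .\ prefix.
SOLUTION="${SOLUTION:-./ExternalSearch.Web.sln}"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# run_all_tests: start the mocked environment, run each test suite in the
# order given above, then show the WireMock logs. The && chain stops at the
# first failing step.
run_all_tests() {
  run docker-compose up --build -d &&
  run dotnet test "$SOLUTION" --filter Unit &&
  run dotnet test "$SOLUTION" --filter Integration &&
  run pwsh -Command Invoke-Pester &&
  run docker-compose logs wiremock
}
```

After sourcing the snippet, `DRY_RUN=1 run_all_tests` lists the commands without executing them, and `run_all_tests` on its own runs them for real.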

About CluedIn

CluedIn is the Cloud-native Master Data Management Platform that brings data teams together, enabling them to deliver the foundation of high-quality, trusted data that empowers everyone to make a difference.

We're different because we use enhanced data management techniques like Graph and Zero Upfront Modelling to reduce the time taken to prepare data and deliver insight by as much as 80%. Installed in as little as 20 minutes from the Azure Marketplace, CluedIn is fully integrated with Microsoft Purview and the full Microsoft Fabric suite, making it the preferred choice for Azure customers.

To learn more about CluedIn, contact the team today.

https://www.cluedin.com