| Key | Value |
| --- | --- |
| Environment | |
| Services | Lambda, Kinesis, Firehose, Elasticsearch, S3 |
| Integrations | Terraform, AWS CLI |
| Categories | Serverless; Event-driven architecture |
| Level | Intermediate |
| GitHub | Repository link |
This Fuzzy Search application demonstrates how to set up an S3-hosted website that enables you to fuzzy-search a movie database. The sample application implements the following integration among the various AWS services:
- A data ingestion pipeline that allows adding movie data to an Elasticsearch index via:
  - An AWS Lambda function, exposed via a function URL.
  - The Lambda function sends the JSON payload to a Kinesis Data Stream.
  - A Kinesis Firehose delivery stream forwards the data to an Elasticsearch domain.
- A frontend / website that:
  - Provides a simple search interface to search for movies in the database.
  - Uses a plain JavaScript script on the HTML page to query data via a second Lambda function.
  - This Lambda function performs a fuzzy query on the movie index in the Elasticsearch cluster.
The following diagram shows the architecture that this sample application builds and deploys:
- An S3 bucket that hosts the website.
- [Lambda](https://docs.localstack.cloud/user-guide/aws/lambda/) functions for feeding the Kinesis stream and performing the fuzzy search.
- Kinesis for forwarding the data to Firehose.
- Firehose for forwarding the data into Elasticsearch.
- Elasticsearch, which actually holds the data.
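To illustrate the ingestion side of this architecture, the sketch below shows the shape of a single request the pipeline accepts: one movie document POSTed as JSON to the ingest Lambda's function URL. The field names (`title`, `year`, `directors`) are illustrative assumptions, not taken from the sample code; the live `curl` call is commented out because it requires the deployed stack.

```shell
# Hypothetical single-movie payload; field names are assumptions for illustration.
movie='{"title": "Reservoir Dogs", "year": 1992, "directors": ["Quentin Tarantino"]}'

# Against the deployed stack you would send it to the function URL like this:
# curl -s -X POST "$ingest_function_url" -H 'Content-Type: application/json' -d "$movie"

# Peek at the title field of the payload:
echo "$movie" | grep -o '"title": "[^"]*"'
# → "title": "Reservoir Dogs"
```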
- LocalStack Pro with the `localstack` CLI.
- Terraform with the `tflocal` wrapper installed.
- AWS CLI with the `awslocal` wrapper.
Start LocalStack Pro with the `LOCALSTACK_API_KEY` pre-configured:

```shell
export LOCALSTACK_API_KEY=<your-api-key>
docker compose up -d
```
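Before deploying, you can sanity-check that LocalStack is reachable via its health endpoint. The live `curl` call is commented out below; the JSON that follows is a trimmed, made-up excerpt of the kind of response the endpoint returns, used here only to show what to look for.

```shell
# Against a running instance (default edge port 4566) you would run:
# curl -s localhost:4566/_localstack/health

# Illustrative (fabricated) excerpt of the response JSON:
health='{"services": {"lambda": "available", "kinesis": "available", "es": "available"}}'

# Each service this sample needs should report "available" or "running":
echo "$health" | grep -o '"available"' | wc -l
```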
You can build and deploy the sample application on LocalStack by running `./run.sh`.
Here are instructions to deploy and test it manually step-by-step.
To build the Terraform application, run the following commands:
```shell
terraform init
terraform plan
terraform apply --auto-approve
```
This will create all resources specified in `main.tf`. This can take a couple of minutes.
Once it is done, you will be able to save the following values into variables by executing these commands:

```shell
ingest_function_url=$(terraform output --raw ingest_lambda_url)
elasticsearch_endpoint=$(terraform output --raw elasticsearch_endpoint)
```
The dataset we will use for this application is a selection of movies and their typical data such as name, author, genre, etc. Execute the following commands to make it available.
```shell
temp_dir=$(mktemp --directory)
movie_dataset_url="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/samples/sample-movies.zip"
curl -L "$movie_dataset_url" > "$temp_dir/sample-movies.zip"
unzip "$temp_dir/sample-movies.zip" -d "$temp_dir/"
```
For the data to work properly in our streaming use case, we need to remove the bulk insert instructions:

```shell
grep -v '^{ "index"' "$temp_dir/sample-movies.bulk" > "$temp_dir/sample-movies-processed.bulk"
mv "$temp_dir/sample-movies-processed.bulk" "$temp_dir/sample-movies.bulk"
```
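To see what this filtering does, here is a self-contained sketch on a two-line synthetic sample in the Elasticsearch bulk format: an index-action line followed by the document itself. The field values are made up for illustration.

```shell
# Build a tiny synthetic sample in bulk format (action line + document line).
sample_dir=$(mktemp -d)
printf '%s\n' \
  '{ "index": { "_index": "movies", "_id": "1" } }' \
  '{ "title": "Pulp Fiction", "year": 1994 }' > "$sample_dir/sample.bulk"

# Drop the bulk index-action lines, keeping only the document lines.
grep -v '^{ "index"' "$sample_dir/sample.bulk"
# → { "title": "Pulp Fiction", "year": 1994 }
```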
We now populate the database with the actual entries via our Lambda function. Execute the following code to insert the entries line by line. It will take quite some time to finish:
```shell
cat "$temp_dir/sample-movies.bulk" | while read -r line
do
  echo -n "."
  echo "$line" | curl -s -X POST "$ingest_function_url" \
    -H 'Content-Type: application/json' \
    -d @- > /dev/null
done
```
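The fuzzy matching itself happens in the second Lambda function. As a hedged sketch of the kind of Elasticsearch query such a function might issue, the block below builds a `match` query with `fuzziness` enabled; the index name `movies` and the searched field `title` are assumptions for illustration, not taken from the sample code.

```shell
# Hypothetical fuzzy-match query body; index and field names are assumptions.
query='{
  "query": {
    "match": {
      "title": { "query": "Quentis", "fuzziness": "AUTO" }
    }
  }
}'

# Against the running cluster you would send it like this:
# curl -s -X POST "$elasticsearch_endpoint/movies/_search" \
#   -H 'Content-Type: application/json' -d "$query"

# Confirm the query body enables fuzziness:
echo "$query" | grep -o '"fuzziness": "AUTO"'
# → "fuzziness": "AUTO"
```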
Now you can access the website with its entries at http://movie-search.s3-website.localhost.localstack.cloud:4566/. If you search, for example, for "Quentis", a misspelling of "Quentin", you should see entries related to the director Quentin Tarantino, similar to the following screenshot.
The LocalStack logs sometimes show error messages regarding the Firehose propagation. While this might reduce the size of the database to some degree, it is still sufficient for demonstration purposes.
We appreciate your interest in contributing to our project and are always looking for new ways to improve the developer experience. We welcome feedback, bug reports, and even feature ideas from the community. Please refer to the contributing file for more details on how to get started.