GitHub - databricks-industry-solutions/transaction-embedding: In this solution accelerator, we build a data asset that captures a full picture of the consumer and goes beyond traditional demographics, income, product and services (who you are) and extends to transactional behavior and shopping preferences (how you bank)

In a previous solution accelerator, we demonstrated the need for a Lakehouse architecture to address one of the key challenges in retail banking: merchant classification. Once card transaction data can be classified with clear brand information, retail banks can leverage this data asset further to unlock deeper customer insights. By moving from a traditional segmentation approach based on demographics, income and credit history towards behavioral clustering based on transactional patterns, millions of underbanked users with limited credit history could join a more inclusive banking ecosystem. Loosely inspired by the excellent work from Capital One and in line with our previous experience in large UK-based retail banking institutions, this solution focuses on learning hidden relationships between customers based on their card transaction patterns. How similar or dissimilar are two customers, based on the shops they visit?
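To make the closing question concrete, here is a minimal sketch in plain Python. It is not the accelerator's method, which learns embeddings from transaction patterns. It is only the simplest baseline one might compare against: each customer is represented by merchant visit counts, and two customers are compared by cosine similarity. The customer names, merchant labels and function names are all hypothetical.

```python
from collections import Counter
from math import sqrt


def merchant_profile(transactions):
    """Count how often a customer visited each merchant."""
    return Counter(transactions)


def cosine_similarity(a, b):
    """Cosine similarity between two merchant-count profiles (0.0 if either is empty)."""
    # Only merchants visited by both customers contribute to the dot product
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical card transactions, already classified by merchant brand
alice = merchant_profile(["grocer", "grocer", "coffee", "bookshop"])
bob = merchant_profile(["grocer", "coffee", "coffee", "bookshop"])
carol = merchant_profile(["airline", "hotel", "hotel"])

print(cosine_similarity(alice, bob))    # shared merchants, so a high score
print(cosine_similarity(alice, carol))  # no shared merchants, so 0.0
```

A count-based comparison like this treats every merchant as unrelated to every other, so two customers who shop at different but similar stores score zero. Learned embeddings are one way to capture such hidden relationships.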

antoine.amend@databricks.com

© 2022 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.

library	description	license	source
PyYAML	Reading Yaml files	MIT	https://github.com/yaml/pyyaml

Instruction

To run this accelerator, clone this repo into a Databricks workspace. Switch to the web-sync branch if you would like to run the version of the notebooks currently published on the Databricks website. Attach the RUNME notebook to any cluster running DBR 11.0 or later, and execute the notebook via Run-All. A multi-step job describing the accelerator pipeline will be created, and a link to it will be provided. Execute the multi-step job to see how the pipeline runs. The job configuration is written in the RUNME notebook in JSON format. The cost associated with running the accelerator is the user's responsibility.
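As a rough illustration of what a multi-step job configuration in JSON can look like, the sketch below builds one in Python. It is not the configuration from the RUNME notebook. The field names (`tasks`, `task_key`, `depends_on`, `notebook_task`, `notebook_path`) follow the general shape of a Databricks Jobs definition. The notebook names come from this repository, but the workspace path, the job name, the helper function and the assumption that the notebooks run strictly in sequence are all hypothetical.

```python
import json

# Hypothetical workspace location of the cloned repo (placeholder, not a real path)
REPO_PATH = "/Repos/<user>/transaction-embedding"

# Notebook names as they appear in this repository; sequential order is an assumption
NOTEBOOKS = ["00_transbed_context", "01_transbed_etl", "02_transbed_ml"]


def build_job_config(repo_path, notebooks):
    """Build a job definition in which each notebook task depends on the previous one."""
    tasks = []
    for i, notebook in enumerate(notebooks):
        task = {
            "task_key": notebook,
            "notebook_task": {"notebook_path": f"{repo_path}/{notebook}"},
        }
        if i > 0:
            # Chain the tasks so they run one after another
            task["depends_on"] = [{"task_key": notebooks[i - 1]}]
        tasks.append(task)
    return {"name": "transaction-embedding (sketch)", "tasks": tasks}


job_config = build_job_config(REPO_PATH, NOTEBOOKS)
print(json.dumps(job_config, indent=2))
```

Cluster settings are deliberately left out of this sketch. The RUNME notebook remains the authoritative source for the actual job definition.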

Folders and files

.github/workflows
config
images
tests
utils
.gitignore
00_transbed_context.py
01_transbed_etl.py
02_transbed_ml.py
CONTRIBUTING.md
LICENSE
NOTICE
README.md
SECURITY.md
requirements.txt
