# Vulnerable Customers

Four key drivers of vulnerability:
1. Health – disabilities or illnesses that affect the ability to carry out day-to-day tasks
2. Life Events – major life events such as bereavement, job loss or relationship breakdown
3. Resilience – low ability to withstand financial or emotional shocks
4. Capability – low knowledge of financial matters or low confidence in managing money (financial capability), and low capability in other relevant areas such as literacy or digital skills

Normalising this into 16 distinct topics:
1. Learning disability
2. Low income
3. Mental health issues
4. Health problems
5. Being a carer
6. Age
7. Physical disability
8. Lack of connectivity
9. Living alone
10. Lone parent
11. Loss of income
12. Leaving care
13. Bereavement
14. Relationship breakdown
15. Release from prison
16. Legal proceedings

Potential approaches:
1. Bag-of-Words: needs enough training data for us to derive sensible features. This is essentially a goal-seeking exercise, because sensible features must include synonyms of the topic at hand.
2. Similarity measure: use a WordNet-based similarity measure to monitor a stream of text for words close in meaning to these topics, i.e. synonyms.
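
For comparison, approach 1 can be sketched in a few lines of standard-library Python. This is only an illustration: the `TOPIC_VOCAB` word lists and the function names are hypothetical, and in practice the per-topic vocabulary would have to be derived from labelled training data, which is exactly the cost noted above.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text, split on non-letters, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical hand-written vocabulary per topic (an assumption of this
# sketch); a real model would learn these features from training data.
TOPIC_VOCAB = {
    "bereavement": {"died", "funeral", "widow", "bereavement"},
    "loss of income": {"redundancy", "redundant", "fired", "furlough"},
}


def topic_counts(text, topic_vocab=TOPIC_VOCAB):
    """Count how many tokens in the text fall into each topic's vocabulary."""
    bag = bag_of_words(text)
    return {
        topic: sum(bag[word] for word in words)
        for topic, words in topic_vocab.items()
    }


example = "I was made redundant last month and the redundancy pay has run out."
print(topic_counts(example))
# {'bereavement': 0, 'loss of income': 2}
```

Any synonym missing from the vocabulary (say, "laid off") is silently ignored, which is why this approach depends so heavily on training data.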

Let's start with the simpler option, approach 2, as it requires no training data.

# NLTK semantic similarity

In [20]:
import pandas as pd
import numpy as np
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.corpus import wordnet
import synonym as syn  # local helper module (synonym.py) that provides phrase_scorer

In [91]:
topic_dictionary = {'disability': 'disabled.n.01',
                    'death': 'die.v.02',
                    'health problems': 'ill.a.01',
                    'being a carer': 'care.v.02',
                    'living alone': 'alone.s.01',
                    'job loss (fired)': 'discharged.s.01',
                    'job loss (redundancy)': 'redundancy.n.02',
                    'job loss (furlough)': 'furlough.v.01'}

In [92]:
phrase = 'I worked in Adult Learning for a county council and my job was paid for by government funding. When the funds ran out, halfway through the financial year, there was suddenly no money to pay my wages. I was called to a meeting with four others (whose jobs were also dependant on the funding) and the news was broken to us. There would be a 12 week consultation period before the redundancy became final. We could choose to take the redundancy payment and go, or we could take another job in a different part of the council. If we refused the job offered to us, we could lose the redundancy settlement.'
syn.phrase_scorer(phrase, topic_dictionary, sim_thresh=0.7, return_hits=True)

{'disability': (0, []),
 'death': (0, []),
 'health problems': (0, []),
 'being a carer': (0, []),
 'living alone': (0, []),
 'job loss (fired)': (0, []),
 'job loss (redundancy)': (3, ['redundancy', 'redundancy', 'redundancy']),
 'job loss (furlough)': (0, [])}
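
The output maps each topic to a `(hit_count, hit_words)` pair. The `synonym` module itself is not shown in this notebook, so the following is only a minimal sketch of how a scorer with that output shape might work. The name `phrase_scorer_sketch`, the pluggable `similarity` callable and the `toy_similarity` stand-in are all assumptions for illustration, not the module's actual interface.

```python
import re


def phrase_scorer_sketch(phrase, topics, similarity, sim_thresh=0.7):
    """Return {topic: (hit_count, hit_words)} for words scoring >= sim_thresh.

    `topics` maps a topic label to a reference term, and `similarity` is any
    callable (word, reference) -> float in [0, 1]. Both are assumptions of
    this sketch, not the interface of the `synonym` module.
    """
    words = re.findall(r"[a-z]+", phrase.lower())
    results = {}
    for topic, reference in topics.items():
        hits = [w for w in words if similarity(w, reference) >= sim_thresh]
        results[topic] = (len(hits), hits)
    return results


def toy_similarity(word, reference):
    """Stand-in measure: 1.0 for an exact match, 0.0 otherwise."""
    return 1.0 if word == reference else 0.0


topics = {"job loss (redundancy)": "redundancy", "death": "death"}
text = "We could take the redundancy payment, or lose the redundancy settlement."
print(phrase_scorer_sketch(text, topics, toy_similarity))
# {'job loss (redundancy)': (2, ['redundancy', 'redundancy']), 'death': (0, [])}
```

With NLTK, `similarity` could instead compare the synsets returned by `wordnet.synsets(word)` against a reference such as `wordnet.synset('redundancy.n.02')` using `path_similarity` or `wup_similarity`. Which measure the real `phrase_scorer` uses is not stated here.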

In [None]:
# Suman: how best to share this .py file? Does the code look clean and tidy?
# Alex