# Using the *Key Point Analysis* service to analyze and find insights in survey data
When you have a large collection of texts representing people’s opinions (such as product reviews, survey answers or social media posts), it is difficult to understand the key issues that come up in the data. Going over thousands of comments manually is prohibitively expensive. Existing automated approaches are often limited to identifying recurring phrases or concepts and the overall sentiment toward them, and do not provide detailed or actionable insights.

In this tutorial you will gain hands-on experience in using *Key Point Analysis* (KPA) for analyzing and deriving insights from open-ended answers.  

The data we will use is [a community survey conducted in the city of Austin](https://data.austintexas.gov/dataset/Community-Survey/s2py-ceb7). In this survey, the citizens of Austin were asked "If there was ONE thing you could share with the Mayor regarding the City of Austin (any comment, suggestion, etc.), what would it be?". 

## 1. Run *Key Point Analysis* (data from 2016)

Let's first import all the packages required for this tutorial and initialize the *Key Point Analysis* client. The client prints information through the logger, so a suitable verbosity level should be set. The client object is configured with an API key, which should be retrieved from the [Project Debater Early Access Program](https://early-access-program.debater.res.ibm.com/) site. In the code below it is passed via the environment variable *DEBATER_API_KEY* (you may also modify the code and place the API key there directly).

In [37]:
from debater_python_api.api.clients.keypoints_client import KpAnalysisClient, KpAnalysisTaskFuture
from debater_python_api.api.clients.key_point_analysis.KpAnalysisUtils import KpAnalysisUtils
import os
import csv
import random

KpAnalysisUtils.init_logger()
api_key = os.environ['DEBATER_API_KEY']
host = 'https://keypoint-matching-backend.debater.res.ibm.com'
keypoints_client = KpAnalysisClient(api_key, host)
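
If *DEBATER_API_KEY* is not set, the `os.environ[...]` lookup above fails with a bare `KeyError`. A minimal sketch of a friendlier lookup follows; `read_api_key` is a hypothetical helper written for this tutorial, not part of the client library:

```python
import os


def read_api_key(env=None, var_name="DEBATER_API_KEY"):
    """Return the API key stored in `var_name`, or raise a descriptive error."""
    env = os.environ if env is None else env
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before creating the client."
        )
    return api_key


# Usage with an explicit mapping, so the sketch runs without a real key.
print(read_api_key({"DEBATER_API_KEY": "my-secret-key"}))
```

In the notebook, `api_key = read_api_key()` could then replace the direct `os.environ` lookup.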

### 1.1 Read the data and run *key point analysis*  over it
Let's read the data from *dataset_austin.csv* file, which holds the Austin survey dataset, and print the first comment.

In [38]:
with open('./dataset_austin.csv') as csv_file:
    reader = csv.DictReader(csv_file)
    comments = [dict(d) for d in reader]

print(f'There are {len(comments)} comments in the dataset')
print(comments[0])

There are 3187 comments in the dataset
{'id': '1', 'year': '2016', 'text': "Dissatisfied traffic and with traffic, timing of street lights.  EXTREMELY dissatisfied with cit govt. interfering in local businesses (Uber/Lyft, income property owners).  Also, extremely dissatisfied with all the free handouts to people who are perfectly capable of earning their own money.  I'm very dissatisfied with the liberal leaning local politicians."}


Each comment is a dictionary with a unique 'id', a 'text' and a 'year'. We will first remove all comments whose text is longer than 1000 characters, since this is a system limit. Then we will filter the comments and keep the ones from 2016.

The *Key Point Analysis* service is able to run over hundreds of thousands of sentences. However, since the computation is resource-intensive (particularly in GPUs), the trial version is limited to 1000 comments. You may request to increase this limit if needed. Since we want the tutorial to be relatively fast and lightweight, we will only run on a sample of 400 comments. Note that running over a larger set improves both the quality and coverage of the results.

In [39]:
comments = [c for c in comments if len(c['text'])<=1000]
comments_2016 = [c for c in comments if c['year'] == '2016']
sample_size = 400
random.seed(0)
comments_2016_sample = random.sample(comments_2016, sample_size)
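
Note that `random.sample` raises a `ValueError` when fewer than `sample_size` comments are available. The sketch below shows one way to combine the filtering and sampling steps defensively. `filter_and_sample` is a hypothetical helper, and the toy comments are made up:

```python
import random


def filter_and_sample(comments, year, sample_size, max_chars=1000, seed=0):
    """Keep comments of `year` within the length limit, then draw a reproducible sample."""
    eligible = [c for c in comments
                if c["year"] == year and len(c["text"]) <= max_chars]
    rng = random.Random(seed)  # local generator: leaves the global random state untouched
    # Cap the sample size so that small datasets do not raise ValueError.
    return rng.sample(eligible, min(sample_size, len(eligible)))


toy_comments = [
    {"id": "1", "year": "2016", "text": "Fix the traffic."},
    {"id": "2", "year": "2016", "text": "x" * 1001},  # too long, dropped
    {"id": "3", "year": "2017", "text": "Lower property taxes."},
]
sample = filter_and_sample(toy_comments, year="2016", sample_size=400)
print([c["id"] for c in sample])  # ['1']
```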

*Key point analysis* is a novel and promising approach for summarization, with an important quantitative angle. This service summarizes a collection of comments on a given topic as a small set of key points. The salience of each key point is given by the number of its matching sentences in the given comments.

In order to run *Key Point Analysis*, do the following steps:

### 1.2 Create a domain
The *Key Point Analysis* service stores the data (and cached-results) in a *domain*. A user can create several domains, one for each dataset. Domains are only accessible to the user who created them.

Create a domain using the **keypoints_client.create_domain(domain=domain, domain_params={})** method. Several parameters can be passed in the domain_params dictionary when creating a domain, as described in the documentation. Leaving it empty gives a good default behaviour. You can also use **KpAnalysisUtils.create_domain_ignore_exists(client=keypoints_client, domain=domain, domain_params={})** if you don't want an exception to be thrown when the domain already exists (note that in this case the domain_params are not updated and remain as they were before). In this tutorial we will first delete the domain if it exists, since we want to start with an empty domain.

Full documentation of the supported *domain_params* and how they affect the domain can be found [here](kpa_parameters.pdf).

In [40]:
domain = 'austin_demo'
KpAnalysisUtils.delete_domain_ignore_doesnt_exist(client=keypoints_client, domain=domain)
keypoints_client.create_domain(domain=domain, domain_params={})

2023-04-02 18:17:23,450 [INFO] keypoints_client.py 48: client calls service (delete): https://keypoint-matching-backend.debater.res.ibm.com/data
2023-04-02 18:17:23,977 [ERROR] keypoints_client.py 62: There is a problem with the request (422): user: db0a12 doesn't have domain: austin_demo
2023-04-02 18:17:23,979 [INFO] KpAnalysisUtils.py 134: domain: austin_demo doesn't exist.
2023-04-02 18:17:23,980 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/domains
2023-04-02 18:17:24,899 [INFO] keypoints_client.py 99: created domain: austin_demo with domain_params: {}


A few domain-related points:
* We can always delete a domain we no longer need using: **KpAnalysisUtils.delete_domain_ignore_doesnt_exist(client=keypoints_client, domain=domain)**
* Keep in mind that a domain has a state. It stores all comments that have been uploaded into it and a cache with all calculations performed over this data.
* If we want to restart and run over the domain from scratch (no comments and no cache), we can delete the domain and then re-create it, or simply use a different domain. Keep in mind that the cache is also cleared, so subsequent runs will take longer.

### 1.3 Upload comments into the domain
Upload the comments into the domain using the **keypoints_client.upload_comments(domain=domain, comments_ids=comments_ids, comments_texts=comments_texts)** method. This method receives the domain, a list of comments_ids and a list of comments_texts. When uploading comments into a domain, the *Key Point Analysis* service splits the comments into sentences and runs a minor cleansing on the sentences. If you have domain-specific knowledge and want to split the comments into sentences yourself, you can upload comments that are already split into sentences and set the *dont_split* parameter to True (in the domain_params when creating the domain); *Key Point Analysis* will then use the provided sentences as is.

Note that:
* Comments_ids must be unique
* The number of comments_ids must match the number of comments_texts
* Comments_texts must not be longer than 1000 characters
* Uploading the same comment several times (same domain + comment_id, comment_text is ignored) is not a problem and the comment is only uploaded once (if the comment_text is different, it is NOT updated).
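
These constraints can be checked locally before calling the service. The following is a minimal sketch; `validate_comments` is a hypothetical helper, not a library function:

```python
def validate_comments(comments_ids, comments_texts, max_chars=1000):
    """Return a list of human-readable problems; an empty list means the batch looks valid."""
    problems = []
    if len(comments_ids) != len(comments_texts):
        problems.append(f"{len(comments_ids)} ids but {len(comments_texts)} texts")
    if len(set(comments_ids)) != len(comments_ids):
        problems.append("comment ids are not unique")
    for comment_id, text in zip(comments_ids, comments_texts):
        if len(text) > max_chars:
            problems.append(f"comment {comment_id} is longer than {max_chars} characters")
    return problems


print(validate_comments(["1", "2"], ["Fix the traffic.", "More buses."]))  # []
print(validate_comments(["1", "1"], ["ok", "y" * 1001]))
```

The second call reports both the duplicated id and the overlong text.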

In [41]:
comments_texts = [comment['text'] for comment in comments_2016_sample]
comments_ids = [comment['id'] for comment in comments_2016_sample]
keypoints_client.upload_comments(domain=domain, comments_ids=comments_ids, comments_texts=comments_texts)

2023-04-02 18:17:31,528 [INFO] keypoints_client.py 120: uploading 400 comments in batches
2023-04-02 18:17:31,529 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:17:32,500 [INFO] keypoints_client.py 134: uploaded 400 comments, out of 400


### 1.4 Wait for the comments to be processed
Comments that are uploaded to the domain are processed asynchronously, which takes some time. An analysis cannot run before this phase finishes, so we need to wait until all comments in the domain are processed using the **keypoints_client.wait_till_all_comments_are_processed(domain=domain)** method.

In [42]:
keypoints_client.wait_till_all_comments_are_processed(domain=domain)

2023-04-02 18:17:40,771 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:17:41,424 [INFO] keypoints_client.py 146: domain: austin_demo, comments status: {'processed_comments': 0, 'processed_sentences': 0, 'pending_comments': 400}
2023-04-02 18:17:51,431 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:17:52,055 [INFO] keypoints_client.py 146: domain: austin_demo, comments status: {'processed_comments': 0, 'processed_sentences': 0, 'pending_comments': 400}
2023-04-02 18:18:02,063 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:18:02,688 [INFO] keypoints_client.py 146: domain: austin_demo, comments status: {'processed_comments': 400, 'processed_sentences': 681, 'pending_comments': 0}
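
The log shows the client polling the comments status roughly every ten seconds until no comments are pending. The client's internals are not shown in this tutorial, so the following is only a generic polling sketch. `wait_until_processed` and the replayed status source are hypothetical; only the status keys are taken from the log lines:

```python
import time


def wait_until_processed(get_status, poll_secs=10.0, timeout_secs=600.0, sleep=time.sleep):
    """Poll `get_status()` until no comments are pending, then return the last status."""
    waited = 0.0
    while True:
        status = get_status()
        if status["pending_comments"] == 0:
            return status
        if waited >= timeout_secs:
            raise TimeoutError(f"comments still pending after {waited:.0f}s: {status}")
        sleep(poll_secs)
        waited += poll_secs


# Hypothetical stand-in for the service: replays the statuses seen in the log.
replayed = iter([
    {"processed_comments": 0, "processed_sentences": 0, "pending_comments": 400},
    {"processed_comments": 0, "processed_sentences": 0, "pending_comments": 400},
    {"processed_comments": 400, "processed_sentences": 681, "pending_comments": 0},
])
final = wait_until_processed(lambda: next(replayed), sleep=lambda secs: None)
print(final["processed_sentences"])  # 681
```

Unlike a bare loop, the sketch gives up after `timeout_secs`, which is useful when a notebook runs unattended.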


### 1.5 Start a Key Point Analysis job
Start a *Key Point Analysis* job using the **future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params)** method. This method receives the domain and a *run_params* dictionary with various parameters for customizing the job. Leaving it empty gives a good default behaviour. The job runs asynchronously, so the method returns a future object.

A few additional options when running an analysis job:
* The analysis is performed over all comments in the domain. If we need to run over a subset of the comments (for example, to split the data by geography, user type or time frame), we can pass a list of comment ids to the comments_ids parameter, and the analysis will use only the provided comments.
* By default, key points are extracted automatically. When we want to provide key points and match all sentences to these key points, we can do so by passing them to the keypoints parameter: **run_params['keypoints'] = [...]**. This enables a human-in-the-loop mode of work, where we first extract key points automatically, then edit them manually (refine imperfect key points, remove duplicates and add missing ones), and then run again, this time providing the edited key points as a given set.
* It is also possible to provide key points and let KPA add missing key points. To do so, pass the key points to the keypoint_candidates parameter: **run_params['keypoint_candidates'] = [...]** (see section 4 for an elaborated example).
* Full documentation of the supported *domain_params* and *run_params* and how they affect the analysis can be found [here](kpa_parameters.pdf).
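
The human-in-the-loop editing step described above can be sketched in plain Python. `edit_key_points` is a hypothetical helper, and the added key point is invented for illustration. Only the `keypoints` and `keypoint_candidates` parameter names come from this tutorial:

```python
def edit_key_points(extracted, remove=(), rename=None, add=()):
    """Apply manual edits to automatically extracted key points, preserving order."""
    rename = rename or {}
    edited = []
    for kp in extracted:
        if kp in remove:
            continue
        kp = rename.get(kp, kp)
        if kp not in edited:  # drop duplicates created by renaming
            edited.append(kp)
    for kp in add:
        if kp not in edited:
            edited.append(kp)
    return edited


extracted = [
    "Improve affordable housing/living.",
    "NEED BETTER TRAFFIC FLOW PLANNING.",
    "REDUCING TRAFFIC NEEDS TO BE PRIORITY.",
]
edited = edit_key_points(
    extracted,
    remove={"REDUCING TRAFFIC NEEDS TO BE PRIORITY."},
    rename={"NEED BETTER TRAFFIC FLOW PLANNING.": "Improve traffic flow planning."},
    add=["Expand parks and green spaces."],  # hypothetical key point
)

# Match all sentences to exactly this edited set ...
run_params_fixed = {"keypoints": edited}
# ... or let KPA add key points that are still missing.
run_params_open = {"keypoint_candidates": edited}
print(edited)
```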

In [43]:
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params={})

2023-04-02 18:18:11,863 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:18:12,537 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo, run_params: {}, job_id: 64299cb494332d1aa92398dc


### 1.6 Wait for the Key Point Analysis job to finish
Use the returned future and wait until results are available using the **kpa_result = future.get_result()** method. The method waits for the job to finish and eventually returns the result. The result is a dictionary containing the key points (sorted in descending order by the number of matched sentences) and, for each key point, a list of matched sentences (sorted in descending order by match score). An additional 'none' key point holds all the sentences that don't match any key point.

In [44]:
kpa_result_2016 = future.get_result(high_verbosity=True, polling_timout_secs=30)

2023-04-02 18:18:21,153 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:18:21,764 [INFO] keypoints_client.py 387: job_id 64299cb494332d1aa92398dc is running, progress: not updated yet
2023-04-02 18:18:51,771 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:18:52,442 [INFO] keypoints_client.py 387: job_id 64299cb494332d1aa92398dc is running, progress: {'total_stages': 2, 'stage_1': {'inferred_batches': 6, 'total_batches': 6, 'batch_size': 2000}}


Stage 1/2: |██████████████████████████████████████████████████| 100.0% Complete




2023-04-02 18:19:22,449 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:19:23,804 [INFO] keypoints_client.py 390: job_id 64299cb494332d1aa92398dc is done, returning result
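
To make the shape of the result more concrete, here is a sketch that summarizes a result dictionary by hand. The key names (`keypoint_matchings`, `keypoint`, `matching`, `comment_id`, `sentence_id`) are assumptions about the layout, and coverage is taken here to mean the share of sentences matched to a key point other than 'none'. Neither is confirmed by this tutorial, so inspect a real result before relying on them:

```python
def summarize_result(result):
    """Return (coverage_percent, [(key_point, n_sentences), ...]) for a KPA-style result."""
    matched, unmatched = set(), set()
    counts = []
    for entry in result["keypoint_matchings"]:  # ASSUMED top-level key
        sentence_keys = {(m["comment_id"], m["sentence_id"]) for m in entry["matching"]}
        if entry["keypoint"] == "none":
            unmatched |= sentence_keys
        else:
            matched |= sentence_keys
            counts.append((entry["keypoint"], len(sentence_keys)))
    total = len(matched | unmatched)
    coverage = 100.0 * len(matched) / total if total else 0.0
    counts.sort(key=lambda item: item[1], reverse=True)
    return coverage, counts


# Toy result in the ASSUMED layout; texts and scores are made up.
toy_result = {"keypoint_matchings": [
    {"keypoint": "Improve affordable housing/living.", "matching": [
        {"comment_id": "1", "sentence_id": 0, "sentence_text": "Rent is too high.", "score": 0.99},
        {"comment_id": "2", "sentence_id": 1, "sentence_text": "Housing is unaffordable.", "score": 0.97},
    ]},
    {"keypoint": "Develop public transportation network.", "matching": [
        {"comment_id": "3", "sentence_id": 0, "sentence_text": "We need light rail.", "score": 0.98},
    ]},
    {"keypoint": "none", "matching": [
        {"comment_id": "4", "sentence_id": 0, "sentence_text": "Thank you.", "score": 0.0},
    ]},
]}
coverage, counts = summarize_result(toy_result)
print(f"coverage: {coverage:.2f}")  # coverage: 75.00
for key_point, n_sentences in counts:
    print(n_sentences, "-", key_point)
```
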


Let's print the results:

In [45]:
KpAnalysisUtils.print_result(kpa_result_2016, n_sentences_per_kp=2, title='2016 Random sample')

2016 Random sample coverage: 44.26
2016 Random sample key points:
73 - Improve affordable housing/living.
	- Cost of living (housing) more expensive than Chicago & traffic is worse.
	- It is also a major contributing factor to the affordability problem in Austin.
41 - Develop public transportation network.
	- affordable housing in key and public transportation to reduce the number of cars on the
	  roads
	- Austin needs to get serious about alternatives to driving including real mass transit,
	  bike, and pedestrian facilities.
30 - NEED BETTER TRAFFIC FLOW PLANNING.
	- Especially downtown with the huge amount of high rise condos being built, need some
	  major traffic changes made.
	- Improvement on traffic problem.
29 - TO HAVE BETTER PLANNING FOR CITY GROWTH.
	- More needs to be done to control the growth of the City.
	- Should have been better prepared for the city growth like Houston, San Antonio, Dallas.
27 - REDUCING TRAFFIC NEEDS TO BE PRIORITY.
	- Fix the traffic!!!
	- Fix the

We can also save the results to a file. This creates two files: one with the key points and all matched sentences, and a summary file with only the key points and their salience.

In [46]:
KpAnalysisUtils.write_result_to_csv(kpa_result_2016, 'austin_survey_2016_kpa_results.csv')

2023-04-02 18:19:36,136 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_kpa_results.csv
2023-04-02 18:19:36,140 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_kpa_results_kps_summary.csv


It is always possible to cancel a pending/running job in the following way:
* **keypoints_client.cancel_kp_extraction_job(\<Job Id\>)**

The job ID can be found in several ways:
1. It is printed when a job is started
2. From the future object: **future.get_job_id()**
3. From the user report: **keypoints_client.get_full_report()** (see below)

It is also possible to stop all jobs in a domain, or even all jobs in all domains (this might be simpler since no job_id is needed):
* **keypoints_client.cancel_all_extraction_jobs_for_domain(domain)**
* **keypoints_client.cancel_all_extraction_jobs_all_domains()**

Please cancel long jobs if the results are no longer needed.

### 1.7 Modify the run_params and increase coverage
Each domain has a cache that stores all intermediate results calculated during the analysis. Therefore, modifying the run_params and running another analysis is much faster, since all overlapping intermediate results are retrieved from the cache.

Let's run again, but now change the **mapping_policy**. The **mapping_policy** is used when mapping all sentences to the final key points; the default value is **NORMAL**. Changing it to **STRICT** causes only sentence and key point pairs with very high matching confidence to be considered matched, increasing precision but potentially decreasing coverage. We will change it to **LOOSE**, which also matches sentences and key points with lower confidence, and is therefore expected to increase coverage at the cost of precision. We will also increase the number of required key points to 100 using the **n_top_kps** parameter.

In [47]:
run_params = {'mapping_policy':'LOOSE', 'n_top_kps': 100}
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params)
kpa_result_2016 = future.get_result(high_verbosity=True, polling_timout_secs=30)
KpAnalysisUtils.write_result_to_csv(kpa_result_2016, 'austin_survey_2016_kpa_results.csv')
KpAnalysisUtils.print_result(kpa_result_2016, n_sentences_per_kp=2, title='Random sample')

2023-04-02 18:21:33,431 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:21:34,097 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo, run_params: {'mapping_policy': 'LOOSE', 'n_top_kps': 100}, job_id: 64299d7e94332d1aa92398dd
2023-04-02 18:21:34,100 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:21:34,677 [INFO] keypoints_client.py 383: job_id 64299d7e94332d1aa92398dd is pending
2023-04-02 18:22:04,682 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:22:06,084 [INFO] keypoints_client.py 390: job_id 64299d7e94332d1aa92398dd is done, returning result
2023-04-02 18:22:06,130 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_kpa_results.csv
2023-04-02 18:22:06,134 [INFO] util

Random sample coverage: 68.30
Random sample key points:
73 - Improve affordable housing/living.
	- Cost of living (housing) more expensive than Chicago & traffic is worse.
	- It is also a major contributing factor to the affordability problem in Austin.
35 - Develop public transportation network.
	- affordable housing in key and public transportation to reduce the number of cars on the
	  roads
	- Austin needs to get serious about alternatives to driving including real mass transit,
	  bike, and pedestrian facilities.
32 - NEED BETTER TRAFFIC FLOW PLANNING.
	- Especially downtown with the huge amount of high rise condos being built, need some
	  major traffic changes made.
	- Improvement on traffic problem.
32 - REDUCING TRAFFIC NEEDS TO BE PRIORITY.
	- Fix the traffic!!!
	- Fix the traffic problem!
29 - TO HAVE BETTER PLANNING FOR CITY GROWTH.
	- More needs to be done to control the growth of the City.
	- Should have been better prepared for the city growth like Houston, San Antonio, 

By changing the mapping policy to **LOOSE** and increasing the number of key points, the coverage was increased from 44% to 68%.
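
The trade-off between the policies can be pictured as a confidence threshold on the match score. The thresholds and scores below are invented purely for illustration, since the values the service actually uses are not given in this tutorial:

```python
# Hypothetical confidence thresholds; the service's real values are not documented here.
THRESHOLDS = {"STRICT": 0.95, "NORMAL": 0.90, "LOOSE": 0.80}


def map_sentences(scores, policy):
    """Map each sentence to its best key point if the score clears the policy threshold."""
    threshold = THRESHOLDS[policy]
    mapping = {}
    for sentence, kp_scores in scores.items():
        best_kp, best_score = max(kp_scores.items(), key=lambda item: item[1])
        mapping[sentence] = best_kp if best_score >= threshold else "none"
    return mapping


def coverage(mapping):
    """Percentage of sentences mapped to a real key point."""
    matched = sum(1 for kp in mapping.values() if kp != "none")
    return 100.0 * matched / len(mapping)


# Made-up match scores: sentence -> {key point: score}
scores = {
    "Fix the traffic!!!": {"Reduce traffic": 0.99, "Affordable housing": 0.10},
    "Rent keeps going up.": {"Reduce traffic": 0.05, "Affordable housing": 0.92},
    "Too many cars downtown.": {"Reduce traffic": 0.85, "Affordable housing": 0.02},
    "Thanks for the survey.": {"Reduce traffic": 0.03, "Affordable housing": 0.04},
}
for policy in ("STRICT", "NORMAL", "LOOSE"):
    print(policy, coverage(map_sentences(scores, policy)))
```

With these toy numbers the coverage rises from 25% (STRICT) to 50% (NORMAL) to 75% (LOOSE), while the lower-scoring matches admitted by LOOSE are the ones most likely to be wrong.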

### 1.8 User Report
A user report lets us see which domains we have (and perhaps delete old ones that are no longer needed) and list past and present analysis jobs, whose job_id can be used to fetch their results (via **KpAnalysisTaskFuture(keypoints_client, \<job_id\>).get_result()**). The report contains all the needed information:

In [None]:
report = keypoints_client.get_full_report()
KpAnalysisUtils.print_report(report)

## 2. Mapping sentences to multiple key points and creating a key points graph
By default, each sentence is mapped to one key point at most (the key point with the highest match-score, that follows the **mapping_policy**). We can run again and ask KPA to map each sentence to all key points that are matched according to the **mapping_policy**, by adding the **sentence_to_multiple_kps** parameter.
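
The difference between the two modes can be sketched for a single sentence. `map_to_key_points`, the scores and the 0.9 threshold are all hypothetical:

```python
def map_to_key_points(kp_scores, threshold, multiple=False):
    """Return the key points matched to one sentence under a score threshold."""
    passing = sorted(
        (kp for kp, score in kp_scores.items() if score >= threshold),
        key=lambda kp: kp_scores[kp],
        reverse=True,
    )
    # Default mode keeps at most the best-scoring key point.
    return passing if multiple else passing[:1]


# Made-up scores for one sentence; 0.9 is a hypothetical threshold.
kp_scores = {
    "Develop public transportation network.": 0.97,
    "REDUCING TRAFFIC NEEDS TO BE PRIORITY.": 0.93,
    "Stop raising property taxes!": 0.04,
}
print(map_to_key_points(kp_scores, threshold=0.9))
print(map_to_key_points(kp_scores, threshold=0.9, multiple=True))
```

The first call returns only the public transportation key point; the second returns both key points that clear the threshold.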

In [48]:
run_params = {'sentence_to_multiple_kps': True, 'n_top_kps': 100}
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params)
kpa_2016_job_id = future.get_job_id() # saving the job_id for a following section
kpa_result_2016 = future.get_result(high_verbosity=True, polling_timout_secs=30)

2023-04-02 18:22:16,024 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:22:16,680 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo, run_params: {'sentence_to_multiple_kps': True, 'n_top_kps': 100}, job_id: 64299da894332d1aa92398de
2023-04-02 18:22:16,681 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:22:17,250 [INFO] keypoints_client.py 383: job_id 64299da894332d1aa92398de is pending
2023-04-02 18:22:47,256 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:22:48,806 [INFO] keypoints_client.py 390: job_id 64299da894332d1aa92398de is done, returning result


In [49]:
KpAnalysisUtils.print_result(kpa_result_2016, n_sentences_per_kp=2, title='Random sample')

Random sample coverage: 58.35
Random sample key points:
76 - Improve affordable housing/living.
	- Cost of living (housing) more expensive than Chicago & traffic is worse.
	- It is also a major contributing factor to the affordability problem in Austin.
61 - This city needs more motor vehicle traffic lanes!!
	- Fix the traffic problems, forget bikes and add more lanes.
	- our roads are getting wider and we need less cars.
49 - REDUCING TRAFFIC NEEDS TO BE PRIORITY.
	- Fix the traffic!!!
	- Fix the traffic problem!
42 - Develop public transportation network.
	- affordable housing in key and public transportation to reduce the number of cars on the
	  roads
	- I really want to see a train system in Austin that connects from Bee Cave Road and other
	  points into downtown.
38 - TO HAVE BETTER PLANNING FOR CITY GROWTH.
	- More needs to be done to control the growth of the City.
	- Should have been better prepared for the city growth like Houston, San Antonio, Dallas.
34 - The highways need

Now that sentences are mapped to multiple key points, it is possible to create a *key points graph* by first saving the results as before, then translating the results file into a graph-data JSON file, and finally loading this JSON file in our demo graph visualization, available at: [key points graph demo](https://keypoint-matching-ui.ris2-debater-event.us-east.containers.appdomain.cloud/)

In [50]:
KpAnalysisUtils.write_result_to_csv(kpa_result_2016, 'austin_survey_2016_multiple_kpa_results.csv')
KpAnalysisUtils.generate_graphs_and_textual_summary('austin_survey_2016_multiple_kpa_results.csv')

2023-04-02 18:23:36,934 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_multiple_kpa_results.csv
2023-04-02 18:23:36,940 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_multiple_kpa_results_kps_summary.csv
2023-04-02 18:23:36,998 [INFO] KpAnalysisUtils.py 261: saving graph in file: austin_survey_2016_multiple_kpa_results_graph_data.json
2023-04-02 18:23:37,000 [INFO] KpAnalysisUtils.py 261: saving graph in file: austin_survey_2016_multiple_kpa_results_hierarchical_graph_data.json
2023-04-02 18:23:37,003 [INFO] KpAnalysisUtils.py 306: saving textual bullets in file: austin_survey_2016_multiple_kpa_results_hierarchical_bullets.txt
2023-04-02 18:23:37,021 [INFO] docx_generator.py 216: Creating key points hierarchy
2023-04-02 18:23:37,035 [INFO] docx_generator.py 222: Creating key points matches tables
2023-04-02 18:23:37,037 [INFO] docx_generator.py 243: creating table for KP: Improve affordable housing/living., n_matches: 50
2023-04-02 18:23:37,055 [INFO] docx_g

**generate_graphs_and_textual_summary** creates 4 files:
* **austin_survey_2016_multiple_kpa_results_graph_data.json**: a graph_data file that can be loaded to: [key points graph demo](https://keypoint-matching-ui.ris2-debater-event.us-east.containers.appdomain.cloud/). It presents the relations between the key points as a graph of key points.
* **austin_survey_2016_multiple_kpa_results_hierarchical_graph_data.json**: another graph_data file that can be loaded to the graph-demo-site. This graph is simplified, which makes it more convenient for extracting insights.
* **austin_survey_2016_multiple_kpa_results_hierarchical_bullets.txt**: This textual file shows the simplified graph (from the previous bullet) as a list of hierarchical bullets.
* **austin_survey_2016_multiple_kpa_results_hierarchical.docx**: This Microsoft Word document shows the textual bullets (from the previous bullet) as a user-friendly report.
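
This tutorial does not describe how the relations between key points are derived. One plausible reading, assumed here for illustration only, is that a key point points to another when a large share of its sentences is also matched to the other. `key_point_edges`, the sentence ids and the 0.5 cut-off are hypothetical:

```python
from itertools import permutations


def key_point_edges(sentences_by_kp, min_weight=0.5):
    """Directed edges (source, target, weight), where weight is the share of the
    source key point's sentences that are also matched to the target."""
    edges = []
    for source, target in permutations(sentences_by_kp, 2):
        source_sents = sentences_by_kp[source]
        if not source_sents:
            continue
        weight = len(source_sents & sentences_by_kp[target]) / len(source_sents)
        if weight >= min_weight:
            edges.append((source, target, round(weight, 2)))
    return edges


# Made-up sentence ids per key point (multi-mapped, so the sets overlap).
sentences_by_kp = {
    "REDUCING TRAFFIC NEEDS TO BE PRIORITY.": {1, 2, 3, 4, 5, 6},
    "The highways need a major overhaul.": {4, 5, 7},
    "Improve affordable housing/living.": {8, 9},
}
for edge in key_point_edges(sentences_by_kp):
    print(edge)
```

Here two of the three highway sentences are also traffic sentences, so the only edge runs from the more specific highway key point to the more general traffic key point, with weight 0.67.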

## 3. Run *Key Point Analysis* incrementally
### 3.1 Run *Key Point Analysis* incrementally on new data (data from 2016 + 2017)
A year has passed, and we have collected additional data (data from 2017). We can now upload the 2017 data to the same domain (austin_demo) and have both 2016 and 2017 data in one domain.

In [51]:
comments_2017 = [c for c in comments if c['year'] == '2017']
random.seed(0)
comments_2017_sample = random.sample(comments_2017, sample_size)

domain = 'austin_demo'
comments_texts = [comment['text'] for comment in comments_2017_sample]
comments_ids = [comment['id'] for comment in comments_2017_sample]
keypoints_client.upload_comments(domain=domain, comments_ids=comments_ids, comments_texts=comments_texts)
keypoints_client.wait_till_all_comments_are_processed(domain=domain)

2023-04-02 18:23:50,176 [INFO] keypoints_client.py 120: uploading 400 comments in batches
2023-04-02 18:23:50,177 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:23:51,132 [INFO] keypoints_client.py 134: uploaded 400 comments, out of 400
2023-04-02 18:23:51,134 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:23:51,770 [INFO] keypoints_client.py 146: domain: austin_demo, comments status: {'processed_comments': 400, 'processed_sentences': 681, 'pending_comments': 400}
2023-04-02 18:24:01,775 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:24:02,516 [INFO] keypoints_client.py 146: domain: austin_demo, comments status: {'processed_comments': 400, 'processed_sentences': 681, 'pending_comments': 400}
2023-04-02 18:24:12,519 [INFO] 

We can now run a new analysis over all the data in the domain, as we did before, and automatically extract new key points. We can assume that some will be identical to the key points extracted on the 2016 data, some will be similar and some key points will be new.

A better option is to run a new analysis but provide the key points from the 2016 analysis, and let *Key Point Analysis* add new key points from the 2017 data if there are any. One benefit of this approach is that the new result will mostly use the 2016 key points, so we will be able to compare the two results and see what changed, what improved and what did not. Another major benefit of this approach is run-time: the 2016 data was already analyzed with these key points, and since we have a cache in place, much of the computation can be avoided. The 2016 key points can be provided via the **run_params['keypoint_candidates'] = [...]** parameter, passing a list of strings, or we can use **run_params['keypoint_candidates_by_job_id'] = \<job_id\>** and provide the job_id of an analysis job. KPA will then take the key points from that job's result automatically. We will use this parameter and provide the *kpa_2016_job_id* we saved in advance.
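
For the first route (passing a list of strings), the key points have to be collected from the earlier result. A minimal sketch, assuming the result dictionary uses the keys `keypoint_matchings`, `keypoint` and `matching` (an assumption about the layout, so verify it against a real result):

```python
def key_points_from_result(result, min_sentences=1):
    """Collect key point strings from a previous result, skipping the 'none' bucket."""
    return [
        entry["keypoint"]
        for entry in result["keypoint_matchings"]  # ASSUMED top-level key
        if entry["keypoint"] != "none" and len(entry["matching"]) >= min_sentences
    ]


# Toy 2016 result in the ASSUMED layout; match details are omitted.
toy_result_2016 = {"keypoint_matchings": [
    {"keypoint": "Improve affordable housing/living.", "matching": [{}, {}, {}]},
    {"keypoint": "Stop raising property taxes!", "matching": [{}]},
    {"keypoint": "none", "matching": [{}, {}]},
]}
run_params = {
    "sentence_to_multiple_kps": True,
    "keypoint_candidates": key_points_from_result(toy_result_2016),
    "n_top_kps": 100,
}
print(run_params["keypoint_candidates"])
```

The `min_sentences` argument is an optional extra that drops rarely matched key points before they are reused as candidates.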

In [52]:
run_params = {'sentence_to_multiple_kps': True,
              'keypoint_candidates_by_job_id': kpa_2016_job_id, 'n_top_kps': 100}
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params)
kpa_result_2016_2017 = future.get_result(high_verbosity=True, polling_timout_secs=30)

2023-04-02 18:24:31,117 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:24:31,843 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo, run_params: {'sentence_to_multiple_kps': True, 'keypoint_candidates_by_job_id': '64299da894332d1aa92398de', 'n_top_kps': 100}, job_id: 64299e2f94332d1aa92398e0
2023-04-02 18:24:31,845 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:24:32,412 [INFO] keypoints_client.py 383: job_id 64299e2f94332d1aa92398e0 is pending
2023-04-02 18:25:02,419 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:25:03,087 [INFO] keypoints_client.py 387: job_id 64299e2f94332d1aa92398e0 is running, progress: {'total_stages': 3, 'stage_0': {'inferred_batches': 3, 'total_batches': 

Stage 1/3: |--------------------------------------------------| 0.0% Complete



2023-04-02 18:26:03,724 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:26:05,560 [INFO] keypoints_client.py 390: job_id 64299e2f94332d1aa92398e0 is done, returning result


In [54]:
KpAnalysisUtils.write_result_to_csv(kpa_result_2016_2017, 'austin_survey_2016_2017_kpa_results.csv')
KpAnalysisUtils.compare_results(kpa_result_2016, kpa_result_2016_2017, '2016', '2016 + 2017')

2023-04-02 18:26:05,843 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_2017_kpa_results.csv
2023-04-02 18:26:05,851 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_2017_kpa_results_kps_summary.csv


Unnamed: 0,key point,2016_n_sents,2016_percent,2016 + 2017_n_sents,2016 + 2017_percent,change_n_sents,change_percent
0,Improve affordable housing/living.,76,11.64%,181,13.13%,105,1.50%
1,This city needs more motor vehicle traffic lan...,61,9.34%,125,9.07%,64,-0.27%
2,REDUCING TRAFFIC NEEDS TO BE PRIORITY.,49,7.50%,106,7.69%,57,0.19%
3,Develop public transportation network.,42,6.43%,84,6.10%,42,-0.34%
4,TO HAVE BETTER PLANNING FOR CITY GROWTH.,38,5.82%,58,4.21%,20,-1.61%
5,The highways need a major overhaul.,34,5.21%,70,5.08%,36,-0.13%
6,Stop raising property taxes!,31,4.75%,80,5.81%,49,1.06%
7,"Utilities, particularly water, is too high.",30,4.59%,61,4.43%,31,-0.17%
8,Don't let Austin become Houston with overdevel...,24,3.68%,35,2.54%,11,-1.14%
9,Make improvements to public transportation in ...,23,3.52%,53,3.85%,30,0.32%
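
In the table, the change columns appear to be simple differences: *change_n_sents* is the difference between the sentence counts, and *change_percent* is the difference between the two percentage columns, in percentage points. The sketch below reproduces that arithmetic on made-up counts. `compare_counts` is a hypothetical helper, and passing the total number of sentences explicitly is an assumption about how the percentages are normalized:

```python
def compare_counts(counts_a, total_a, counts_b, total_b):
    """Per key point: sentence counts, share of all sentences (%), and the change."""
    rows = []
    for key_point in counts_a:
        n_a, n_b = counts_a[key_point], counts_b.get(key_point, 0)
        pct_a = 100.0 * n_a / total_a
        pct_b = 100.0 * n_b / total_b
        rows.append({
            "key point": key_point,
            "n_a": n_a, "percent_a": round(pct_a, 2),
            "n_b": n_b, "percent_b": round(pct_b, 2),
            "change_n": n_b - n_a,
            "change_percent": round(pct_b - pct_a, 2),  # percentage points
        })
    return rows


# Made-up counts and totals, chosen only to keep the arithmetic easy to check.
rows = compare_counts(
    {"Improve affordable housing/living.": 50, "Stop raising property taxes!": 20}, 500,
    {"Improve affordable housing/living.": 120, "Stop raising property taxes!": 30}, 1000,
)
for row in rows:
    print(row)
```

A key point can gain sentences in absolute terms and still lose share, as the second toy row shows (20 to 30 sentences, but 4% down to 3%).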


### 3.2 Run *Key Point Analysis* incrementally on new data (2017 independently)
Using the **comments_ids** parameter of the **start_kp_analysis_job** method, we can run over a subset of the comments in the domain. Let's do that and run an analysis over the 2017 comments independently. We will provide the key points from 2016, since we want to be able to compare the two results:
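
When several subsets are needed (one per year, for instance), the id lists can be prepared in one pass. `ids_by_field` is a hypothetical helper and the toy comments are made up:

```python
from collections import defaultdict


def ids_by_field(comments, field):
    """Group comment ids by the value of `field` (e.g. 'year')."""
    groups = defaultdict(list)
    for comment in comments:
        groups[comment[field]].append(comment["id"])
    return dict(groups)


toy_comments = [
    {"id": "1", "year": "2016", "text": "Fix the traffic."},
    {"id": "2", "year": "2017", "text": "Lower property taxes."},
    {"id": "3", "year": "2017", "text": "More buses, please."},
]
groups = ids_by_field(toy_comments, "year")
print(groups["2017"])  # ['2', '3']
```

Each list in `groups` can then be passed as the **comments_ids** argument of a separate analysis job.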

In [55]:
comments_ids = [comment['id'] for comment in comments_2017_sample]
run_params = {'sentence_to_multiple_kps': True,
              'keypoint_candidates_by_job_id': kpa_2016_job_id, 'n_top_kps': 100}
future = keypoints_client.start_kp_analysis_job(comments_ids=comments_ids, domain=domain, run_params=run_params)
kpa_result_2017 = future.get_result(high_verbosity=True, polling_timout_secs=30)

KpAnalysisUtils.write_result_to_csv(kpa_result_2017, 'austin_survey_2017_kpa_results.csv')

2023-04-02 18:30:02,300 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:30:03,287 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo, run_params: {'sentence_to_multiple_kps': True, 'keypoint_candidates_by_job_id': '64299da894332d1aa92398de', 'n_top_kps': 100}, job_id: 64299f7b94332d1aa92398e1
2023-04-02 18:30:03,290 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:30:03,893 [INFO] keypoints_client.py 383: job_id 64299f7b94332d1aa92398e1 is pending
2023-04-02 18:30:33,900 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:30:34,575 [INFO] keypoints_client.py 387: job_id 64299f7b94332d1aa92398e1 is running, progress: {'total_stages': 3, 'stage_0': {'inferred_batches': 4, 'total_batches': 

Stage 1/3: |█████████████████████████-------------------------| 50.0% Complete



2023-04-02 18:31:35,193 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:31:35,859 [INFO] keypoints_client.py 387: job_id 64299f7b94332d1aa92398e1 is running, progress: {'total_stages': 3, 'stage_0': {'inferred_batches': 4, 'total_batches': 4, 'batch_size': 2000}, 'stage_1': {'inferred_batches': 2, 'total_batches': 2, 'batch_size': 2000}}


Stage 1/3: |██████████████████████████████████████████████████| 100.0% Complete




2023-04-02 18:32:05,862 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:32:07,391 [INFO] keypoints_client.py 390: job_id 64299f7b94332d1aa92398e1 is done, returning result
2023-04-02 18:32:07,454 [INFO] utils.py 59: Writing dataframe to: austin_survey_2017_kpa_results.csv
2023-04-02 18:32:07,459 [INFO] utils.py 59: Writing dataframe to: austin_survey_2017_kpa_results_kps_summary.csv


In [56]:
KpAnalysisUtils.compare_results(kpa_result_2016, kpa_result_2017, '2016', '2017')

Unnamed: 0,key point,2016_n_sents,2016_percent,2017_n_sents,2017_percent,change_n_sents,change_percent
0,Improve affordable housing/living.,76,11.64%,102,14.07%,26,2.43%
1,This city needs more motor vehicle traffic lan...,61,9.34%,60,8.28%,-1,-1.07%
2,REDUCING TRAFFIC NEEDS TO BE PRIORITY.,49,7.50%,49,6.76%,0,-0.75%
3,Develop public transportation network.,42,6.43%,35,4.83%,-7,-1.60%
4,TO HAVE BETTER PLANNING FOR CITY GROWTH.,38,5.82%,20,2.76%,-18,-3.06%
5,The highways need a major overhaul.,34,5.21%,29,4.00%,-5,-1.21%
6,Stop raising property taxes!,31,4.75%,47,6.48%,16,1.74%
7,"Utilities, particularly water, is too high.",30,4.59%,27,3.72%,-3,-0.87%
8,Don't let Austin become Houston with overdevel...,24,3.68%,7,0.97%,-17,-2.71%
9,Make improvements to public transportation in ...,23,3.52%,26,3.59%,3,0.06%


Running over subsets of the data in the domain enables us to compare results between them. Subsets can be data from different GEOs, different organizations, different users (e.g. promoters/detractors), etc.
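
To make the comparison columns concrete, here is a minimal sketch of how such a table could be derived from per-key-point sentence counts. It is not the implementation of `KpAnalysisUtils.compare_results`; the function name `compare_counts` is hypothetical, and the denominators 653 and 725 are assumptions back-computed from the percentages in the table above, not values reported by the service.

```python
def compare_counts(counts_a, counts_b, total_a, total_b):
    """Build comparison rows for the key points present in counts_a.

    Each row is (key_point, percent_a, percent_b, change_n_sents, change_percent).
    total_a and total_b are the denominators used for the percentages
    (assumed here; the service may define them differently).
    """
    rows = []
    for key_point, n_a in counts_a.items():
        n_b = counts_b.get(key_point, 0)
        pct_a = 100.0 * n_a / total_a
        pct_b = 100.0 * n_b / total_b
        rows.append((key_point,
                     round(pct_a, 2),
                     round(pct_b, 2),
                     n_b - n_a,
                     round(pct_b - pct_a, 2)))
    return rows


# Counts copied from the 2016 vs. 2017 table above.
counts_2016 = {"Improve affordable housing/living.": 76,
               "Stop raising property taxes!": 31}
counts_2017 = {"Improve affordable housing/living.": 102,
               "Stop raising property taxes!": 47}

# ASSUMPTION: denominators inferred from the published percentages.
rows = compare_counts(counts_2016, counts_2017, total_a=653, total_b=725)
for row in rows:
    print(row)
```

Note that the change in percent is computed from the unrounded percentages, which is why it can differ by 0.01 from the difference of the two rounded columns.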

## 4. Run *Key Point Analysis* on each stance separately
In many use cases (surveys, customer feedback, etc.) the comments have a positive and/or negative stance, and it is useful to run a separate KPA analysis on each stance. Most stance detection models don't perform well on survey data (or on customer feedback and similar sources) because the comments tend to contain many "suggestions". A suggestion often appears positive to the model even though the user is asking to improve something that needs improvement.
To that end, we trained a stance model that handles suggestions well and labels each sentence as 'Positive', 'Negative', 'Neutral' or 'Suggestion'. We usually treat suggestions as negatives and run two separate analyses: the first over 'Positive' sentences and the second over 'Negative' and 'Suggestion' sentences.

This has the following advantages:
* Creates a separate positive/negative summary that shows clearly what works well and what needs to be improved.
* Filters-out neutral sentences that usually don't contain valuable information.
* Helps the matching model avoid stance mistakes (matching a positive sentence to a negative key point and vice-versa).
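
The grouping described above can be sketched in a few lines. This is only an illustration of the routing rule (suggestions go with negatives, neutrals are dropped); the `split_by_stance` helper, the `stance` field and the labels assigned to the sample sentences are hypothetical, and the service performs this step internally.

```python
# Labels that are analyzed together in the "negative" run.
NEGATIVE_LIKE = {"Negative", "Suggestion"}


def split_by_stance(sentences):
    """Return (positives, negatives); neutral sentences are filtered out."""
    positives = [s["text"] for s in sentences if s["stance"] == "Positive"]
    negatives = [s["text"] for s in sentences if s["stance"] in NEGATIVE_LIKE]
    return positives, negatives


# Hypothetical labeled sentences (labels assigned by hand for illustration;
# the last sentence is made up).
sample = [
    {"text": "Keep up the good work!", "stance": "Positive"},
    {"text": "Stop raising property taxes!", "stance": "Negative"},
    {"text": "Develop public transportation network.", "stance": "Suggestion"},
    {"text": "I have lived here for ten years.", "stance": "Neutral"},
]

positives, negatives = split_by_stance(sample)
print(positives)
print(negatives)
```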

Let's run again over the Austin survey dataset, but this time create two separate KPA analyses (positive and negative). We first need to create a new domain and set the domain_param **do_stance_analysis**.

In [57]:
domain = 'austin_demo_two_stances'
domain_params = {'do_stance_analysis': True}
KpAnalysisUtils.delete_domain_ignore_doesnt_exist(client=keypoints_client, domain=domain)
keypoints_client.create_domain(domain=domain, domain_params=domain_params)

2023-04-02 18:33:24,693 [INFO] keypoints_client.py 48: client calls service (delete): https://keypoint-matching-backend.debater.res.ibm.com/data
2023-04-02 18:33:25,235 [ERROR] keypoints_client.py 62: There is a problem with the request (422): user: db0a12 doesn't have domain: austin_demo_two_stances
2023-04-02 18:33:25,236 [INFO] KpAnalysisUtils.py 134: domain: austin_demo_two_stances doesn't exist.
2023-04-02 18:33:25,237 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/domains
2023-04-02 18:33:26,125 [INFO] keypoints_client.py 99: created domain: austin_demo_two_stances with domain_params: {'do_stance_analysis': True}


Let's upload all 2016 comments to the new domain and wait for them to be processed. This time the sentences' stance is also calculated.

In [58]:
comments_texts = [comment['text'] for comment in comments_2016]
comments_ids = [comment['id'] for comment in comments_2016]
keypoints_client.upload_comments(domain=domain, comments_ids=comments_ids, comments_texts=comments_texts)
keypoints_client.wait_till_all_comments_are_processed(domain=domain)

2023-04-02 18:33:33,825 [INFO] keypoints_client.py 120: uploading 1588 comments in batches
2023-04-02 18:33:33,826 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:33:35,063 [INFO] keypoints_client.py 134: uploaded 1588 comments, out of 1588
2023-04-02 18:33:35,066 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:33:35,715 [INFO] keypoints_client.py 146: domain: austin_demo_two_stances, comments status: {'processed_comments': 0, 'processed_sentences': 0, 'pending_comments': 1588}
2023-04-02 18:33:45,721 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/comments
2023-04-02 18:33:46,340 [INFO] keypoints_client.py 146: domain: austin_demo_two_stances, comments status: {'processed_comments': 0, 'processed_sentences': 0, 'pending_comments': 1588}
2023-04-02

We can download the processed sentences and save them into a csv if we want to examine the processed data.

In [59]:
sentences = keypoints_client.get_sentences_for_domain(domain=domain)
KpAnalysisUtils.write_sentences_to_csv(sentences, f'{domain}_sentences.csv')

2023-04-02 18:34:18,305 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/data
2023-04-02 18:34:20,391 [INFO] keypoints_client.py 333: returning 2707 sentences for domain austin_demo_two_stances


And now, run two analyses, one over the positive sentences and one over the negative + suggestions.

In [60]:
run_params = {'sentence_to_multiple_kps': True, "n_top_kps":100}
run_params['stances_to_run'] = ['pos']
run_params['stances_threshold'] = 0.5
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params)
kpa_pos_result = future.get_result(high_verbosity=True, polling_timout_secs=30)
KpAnalysisUtils.print_result(kpa_pos_result, n_sentences_per_kp=2, title='Random sample positives')

2023-04-02 18:34:21,199 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:34:21,986 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo_two_stances, run_params: {'sentence_to_multiple_kps': True, 'n_top_kps': 100, 'stances_to_run': ['pos'], 'stances_threshold': 0.5}, job_id: 6429a07d94332d1aa92398e3
2023-04-02 18:34:21,987 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:34:22,606 [INFO] keypoints_client.py 383: job_id 6429a07d94332d1aa92398e3 is pending
2023-04-02 18:34:52,612 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:34:53,804 [INFO] keypoints_client.py 390: job_id 6429a07d94332d1aa92398e3 is done, returning result


Random sample positives coverage: 19.51
Random sample positives key points:
7 - I think you're doing a great job!
	- Whoever you are, you're doing great.
	- I believe he is doing a fine job.
5 - Keep up the good work!
	- Keep up the good work!
	- Keep up the good work!
4 - Keep it up!
	- Keep doing your JOB
	- KEEP UP THE GOOD WORK, MR MAYOR
4 - This holds up traffic tremendously.
	- Improvement on traffic problem.
	- Austin is a wonderful place to live except of course the traffic.
3 - CONTINUED SUPPORT FOR N HOOD PARKS AND POOLS.
	- Please keep adding bike lanes and please protect our parks and historical sites.
	- Our lakes are full.
3 - City services (water, streets, electric) are outstanding!!
	- When I moved to Austin city streets were clean, mowing done, water run off clean for
	  drainage.
	- RESIDENTIAL SERVICES ARE EXCELLENT!


As in many surveys, most comments are negative or suggestions, so the positive analysis is relatively limited. Let's see how the negative analysis goes.

In [61]:
run_params['stances_to_run'] = ['neg', 'sug']
run_params['stances_threshold'] = 0.5
future = keypoints_client.start_kp_analysis_job(domain=domain, run_params=run_params, comments_ids=comments_ids)
kpa_neg_result = future.get_result(high_verbosity=True, polling_timout_secs=30)

2023-04-02 18:35:00,206 [INFO] keypoints_client.py 48: client calls service (post): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:35:01,677 [INFO] keypoints_client.py 202: started a kp analysis job - domain: austin_demo_two_stances, run_params: {'sentence_to_multiple_kps': True, 'n_top_kps': 100, 'stances_to_run': ['neg', 'sug'], 'stances_threshold': 0.5}, job_id: 6429a0a594332d1aa92398e4
2023-04-02 18:35:01,678 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:35:02,267 [INFO] keypoints_client.py 383: job_id 6429a0a594332d1aa92398e4 is pending
2023-04-02 18:35:32,273 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:35:32,920 [INFO] keypoints_client.py 387: job_id 6429a0a594332d1aa92398e4 is running, progress: {'total_stages': 2, 'stage_1': {'inferred_batches': 0, 'total

Stage 1/2: |--------------------------------------------------| 0.0% Complete



2023-04-02 18:36:02,927 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:36:03,645 [INFO] keypoints_client.py 387: job_id 6429a0a594332d1aa92398e4 is running, progress: {'total_stages': 2, 'stage_1': {'inferred_batches': 8, 'total_batches': 20, 'batch_size': 2000}}


Stage 1/2: |████████████████████------------------------------| 40.0% Complete



2023-04-02 18:36:33,652 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:36:34,326 [INFO] keypoints_client.py 387: job_id 6429a0a594332d1aa92398e4 is running, progress: {'total_stages': 2, 'stage_1': {'inferred_batches': 20, 'total_batches': 20, 'batch_size': 2000}}


Stage 1/2: |██████████████████████████████████████████████████| 100.0% Complete




2023-04-02 18:37:04,331 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:37:05,004 [INFO] keypoints_client.py 387: job_id 6429a0a594332d1aa92398e4 is running, progress: {'total_stages': 2, 'stage_1': {'inferred_batches': 20, 'total_batches': 20, 'batch_size': 2000}, 'stage_2': {'inferred_batches': 0, 'total_batches': 3, 'batch_size': 2000}}


Stage 2/2: |--------------------------------------------------| 0.0% Complete



2023-04-02 18:37:35,010 [INFO] keypoints_client.py 48: client calls service (get): https://keypoint-matching-backend.debater.res.ibm.com/kp_extraction
2023-04-02 18:37:37,329 [INFO] keypoints_client.py 390: job_id 6429a0a594332d1aa92398e4 is done, returning result


Let's print the results:

In [62]:
KpAnalysisUtils.print_result(kpa_neg_result, n_sentences_per_kp=2, title='Random sample negatives')

Random sample negatives coverage: 67.04
Random sample negatives key points:
348 - Traffic congestion needs major improvement
	- I really wish that city planning would find a way to improve traffic flow.
	- Also, the traffic in Austin is ridiculous and the lack of public transportation needs
	  improvements.
256 - We really need to improve our public transportation.
	- Need to rework zoning to improve traffic and add more public transit, especially to
	  suburbs.
	- The city needs a more comprehensive and better connected transportation network
	  including bus, rail, bike, pedestrian access.
239 - Improve affordable housing/living.
	- we need more affordable housing opportunities so that we don't become a city of only
	  wealthy people
	- Affordable housing is also a problem- the rents are getting too high and are hard for
	  families to afford.
238 - Overall living costs in Austin are too high!
	- Austin has become too expensive and wages have been stagnant for too long.
	- The cost o

With a coverage of about 67%, most of the sentences are matched to the 100 automatically extracted key points.

We can increase **stances_threshold** when we want to run over fewer sentences, keeping only those with a stronger stance. This is useful when we have a large dataset with many less-relevant sentences and we want to filter them out.
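
The effect of a higher threshold can be illustrated with a small filter. This is a sketch under stated assumptions, not the service's logic: it assumes each sentence carries a single stance label and a confidence score in [0, 1], and the `filter_by_stance_confidence` helper, the `score` field and the score values are all hypothetical.

```python
def filter_by_stance_confidence(sentences, stances, threshold):
    """Keep sentences whose stance is requested and whose score meets the threshold."""
    return [s for s in sentences
            if s["stance"] in stances and s["score"] >= threshold]


# Hypothetical stance scores; the texts are key points from this tutorial.
sample = [
    {"text": "Stop raising property taxes!", "stance": "neg", "score": 0.95},
    {"text": "Develop public transportation network.", "stance": "sug", "score": 0.62},
    {"text": "The highways need a major overhaul.", "stance": "neg", "score": 0.51},
    {"text": "Keep up the good work!", "stance": "pos", "score": 0.90},
]

loose = filter_by_stance_confidence(sample, {"neg", "sug"}, threshold=0.5)
strict = filter_by_stance_confidence(sample, {"neg", "sug"}, threshold=0.8)
print(len(loose), len(strict))
```

Raising the threshold from 0.5 to 0.8 keeps only the sentence with the strongest negative stance, which mirrors the trade-off described above: fewer sentences, but a clearer signal.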

We can mark the stance in the results:

In [63]:
kpa_pos_result = KpAnalysisUtils.set_stance_to_result(kpa_pos_result, 'pos')
kpa_neg_result = KpAnalysisUtils.set_stance_to_result(kpa_neg_result, 'neg')

And save the results (both pos/neg separately and merged) and create the key points graphs' data files as we did before:

In [64]:
pos_result_file = 'austin_survey_2016_pro_kpa_results.csv'
KpAnalysisUtils.write_result_to_csv(kpa_pos_result, pos_result_file)
KpAnalysisUtils.generate_graphs_and_textual_summary(pos_result_file)

neg_result_file = 'austin_survey_2016_neg_kpa_results.csv'
KpAnalysisUtils.write_result_to_csv(kpa_neg_result, neg_result_file)
KpAnalysisUtils.generate_graphs_and_textual_summary(neg_result_file)

kpa_merged_result = KpAnalysisUtils.merge_two_results(kpa_pos_result, kpa_neg_result)
merged_result_file = 'austin_survey_2016_merged_kpa_results.csv'
KpAnalysisUtils.write_result_to_csv(kpa_merged_result, merged_result_file)
KpAnalysisUtils.generate_graphs_and_textual_summary(merged_result_file)

2023-04-02 18:39:44,486 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_pro_kpa_results.csv
2023-04-02 18:39:44,489 [INFO] utils.py 59: Writing dataframe to: austin_survey_2016_pro_kpa_results_kps_summary.csv
2023-04-02 18:39:44,509 [INFO] KpAnalysisUtils.py 261: saving graph in file: austin_survey_2016_pro_kpa_results_graph_data.json
2023-04-02 18:39:44,510 [INFO] KpAnalysisUtils.py 261: saving graph in file: austin_survey_2016_pro_kpa_results_hierarchical_graph_data.json
2023-04-02 18:39:44,511 [INFO] KpAnalysisUtils.py 306: saving textual bullets in file: austin_survey_2016_pro_kpa_results_hierarchical_bullets.txt
2023-04-02 18:39:44,524 [INFO] docx_generator.py 216: Creating key points hierarchy
2023-04-02 18:39:44,526 [INFO] docx_generator.py 222: Creating key points matches tables
2023-04-02 18:39:44,527 [INFO] docx_generator.py 243: creating table for KP: I think you're doing a great job!, n_matches: 7
2023-04-02 18:39:44,531 [INFO] docx_generator.py 243: creating t

We can also use the incremental approach when running on each stance separately. To do so, we would provide the job_id of the 2016 positive analysis when running on the positive sentences of 2016 + 2017, and the job_id of the 2016 negative analysis when running on the negative sentences of 2016 + 2017. For simplicity, we didn't combine the two features in this tutorial.
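
As a rough sketch of what combining the two features might look like, the helper below assembles one `run_params` dictionary per stance from the parameters already used in this tutorial. The helper name `incremental_stance_run_params` is hypothetical, the job ids are placeholders, and the combination was not run here, so treat it as untested against the service.

```python
def incremental_stance_run_params(stances, previous_job_id,
                                  stances_threshold=0.5, n_top_kps=100):
    """Build run_params for a per-stance run that reuses earlier key points."""
    return {
        'sentence_to_multiple_kps': True,
        'n_top_kps': n_top_kps,
        'stances_to_run': list(stances),
        'stances_threshold': stances_threshold,
        # Key points extracted by the earlier per-stance job are offered
        # as candidates again.
        'keypoint_candidates_by_job_id': previous_job_id,
    }


# Placeholders: substitute the job ids of your own 2016 per-stance runs.
pos_params = incremental_stance_run_params(['pos'], '<2016 positive job_id>')
neg_params = incremental_stance_run_params(['neg', 'sug'], '<2016 negative job_id>')
print(pos_params)
print(neg_params)
```

Each dictionary would then be passed as `run_params` to `start_kp_analysis_job`, in the same way as in the cells above.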

## 5. Cleanup
If you have finished the tutorial and no longer need the domains and the results, cleaning up is always advised:

In [65]:
KpAnalysisUtils.delete_domain_ignore_doesnt_exist(client=keypoints_client, domain='austin_demo')
KpAnalysisUtils.delete_domain_ignore_doesnt_exist(client=keypoints_client, domain='austin_demo_two_stances')

2023-04-02 18:40:10,112 [INFO] keypoints_client.py 48: client calls service (delete): https://keypoint-matching-backend.debater.res.ibm.com/data
2023-04-02 18:40:11,386 [INFO] KpAnalysisUtils.py 130: domain: austin_demo was deleted
2023-04-02 18:40:11,389 [INFO] keypoints_client.py 48: client calls service (delete): https://keypoint-matching-backend.debater.res.ibm.com/data
2023-04-02 18:40:12,803 [INFO] KpAnalysisUtils.py 130: domain: austin_demo_two_stances was deleted


## 6. Conclusion
In this tutorial, we showed how to use the *Key Point Analysis* service and how it provides detailed insights over survey data right out of the box, significantly reducing the effort required by a data scientist to analyze the data. We also demonstrated key *Key Point Analysis* features: how to modify the analysis parameters and increase coverage, how to use the stance model and create per-stance results, how to create a *key points graph* and further improve the quality and clarity of the results, and how to incrementally add new data.

Feel free to contact us for questions or assistance: *yoavka@il.ibm.com*