### Topic Modelling is the task of using unsupervised learning to extract the main topics (represented as a set of words) that occur in a collection of documents.

## About LDA

- LDA (Latent Dirichlet Allocation) assigns the text of each document to one or more topics. It learns a topics-per-document model and a words-per-topic model, each with a Dirichlet prior.
- Each document is modeled as a multinomial distribution over topics, and each topic is modeled as a multinomial distribution over words.
- LDA assumes that every chunk of text we feed into it contains words that are somehow related, so choosing the right corpus is crucial.
- It also assumes that documents are produced from a mixture of topics. Those topics then generate words according to their probability distributions.
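
The generative story in the bullets above can be sketched in a few lines of NumPy. This is only an illustration: the six-word vocabulary, the two topics, and the hyperparameters `alpha` and `eta` are made-up assumptions, not values from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and sizes, assumed purely for illustration.
vocab = np.array(["loan", "mortgage", "card", "fraud", "payment", "debt"])
n_topics = 2
doc_length = 8

# Hypothetical Dirichlet hyperparameters.
alpha = np.full(n_topics, 0.5)     # prior over topics within a document
eta = np.full(len(vocab), 0.5)     # prior over words within a topic

# Each topic is a distribution over the vocabulary, drawn from Dirichlet(eta).
topic_word = rng.dirichlet(eta, size=n_topics)

# Each document gets its own topic mixture, drawn from Dirichlet(alpha).
doc_topic = rng.dirichlet(alpha)

# For every word position: pick a topic, then pick a word from that topic.
topic_ids = rng.choice(n_topics, size=doc_length, p=doc_topic)
words = [str(vocab[rng.choice(len(vocab), p=topic_word[z])]) for z in topic_ids]
document = " ".join(words)
print(document)
```

Fitting LDA runs this story in reverse: given only the documents, it estimates `doc_topic` and `topic_word`.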

In [69]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer
from sklearn import decomposition
import re
import nltk
from nltk.stem.porter import PorterStemmer
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings('ignore')

In [70]:
nltk.download('punkt')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\GCS\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [71]:
df = pd.read_csv('D:/AppliedAICourse/Projects/NLP/end to end topic modelling/consumer_compliants.csv')
df.head(3)

Unnamed: 0,Date received,Product,Sub-product,Issue,Sub-issue,Consumer complaint narrative,Company public response,Company,State,ZIP code,Tags,Consumer consent provided?,Submitted via,Date sent to company,Company response to consumer,Timely response?,Consumer disputed?,Complaint ID
0,4/3/2020,Vehicle loan or lease,Loan,Getting a loan or lease,Fraudulent loan,"This auto loan was opened on XX/XX/2020 in XXXX, NC with BB & T in my name. I have NEVER been to North Carolina and I have NEVER been a resident. I have filed a dispute twice through my credit bureaus but both times BB & T has claimed that this is an accurate loan. Which I wasn't aware of until today. I have tried to contact BB & T multiple times but I have never gotten through to a live person. I do n't drive and I have never owned a car before. I didn't have any knowledge of this account until I checked XXXXXXXX XXXX and noticed it. I've tried twice to dispute it. Additionally I never received any bills or information about this account. This is my last resort in trying to remove this fraudulent loan off of my account.",Company has responded to the consumer and the CFPB and chooses not to provide a public response,TRUIST FINANCIAL CORPORATION,PA,,,Consent provided,Web,4/3/2020,Closed with explanation,Yes,,3591341
1,3/12/2020,Debt collection,Payday loan debt,Attempts to collect debt not owed,Debt is not yours,"In XXXX of 2019 I noticed a debt for {$620.00} on my credit which i believed was mine I thought speedy cash had bought one of my old debts and sold it to XXXX XXXX XXXX XXXX. I contacted XXXX XXXX XXXX XXXX and after several attempts of giving my full name, nothing came up in their system. I gave my social and the rep said the account popped up but DID NOT tell me that the account was under someone elses name and continued to let me make a payment. The payment was for {$120.00}. Confirmation number-XXXX. After realizing it was not my account, I called back to get my money back and inform them of the mistake. I was told i needed to mail them an FTC report and dispute letter to get my money back. I completed all of this and when i called again they said they transferred the account back to speedy cash for fraud review and I would need to contact them. After contacting them i was again told that i can not get my money back. The issue im having is this representative at XXXX XXXX played blind to obvious fraud and let an innocent person make a payment on someone elses debt and i want my money back.",,CURO Intermediate Holdings,CO,806XX,,Consent provided,Web,3/12/2020,Closed with explanation,Yes,,3564184
2,2/6/2020,Vehicle loan or lease,Loan,Getting a loan or lease,Credit denial,"As stated from Capital One, XXXX XX/XX/XXXX and XXXX 2018, My wife and I went to several car dealerships to request for a car loan to get a used car. However, according to their credit requirements unfortunately my credit score was insufficient for the car loan approval at that time. It seemed as though they pulled my credit report multiple times.",,CAPITAL ONE FINANCIAL CORPORATION,OH,430XX,,Consent provided,Web,2/6/2020,Closed with explanation,Yes,,3521949


In [72]:
df['Consumer complaint narrative'][0]

"This auto loan was opened on XX/XX/2020 in XXXX, NC with BB & T in my name. I have NEVER been to North Carolina and I have NEVER been a resident. I have filed a dispute twice through my credit bureaus but both times BB & T has claimed that this is an accurate loan. Which I wasn't aware of until today. I have tried to contact BB & T multiple times but I have never gotten through to a live person. I do n't drive and I have never owned a car before. I didn't have any knowledge of this account until I checked XXXXXXXX XXXX  and noticed it. I've tried twice to dispute it. Additionally I never received any bills or information about this account. This is my last resort in trying to remove this fraudulent loan off of my account."

In [73]:
df['Product'].value_counts()

Debt collection                21772
Credit card or prepaid card    13193
Mortgage                       9799 
Checking or savings account    7003 
Student loan                   2950 
Vehicle loan or lease          2736 
Name: Product, dtype: int64

In [74]:
df['Company'].value_counts()

CITIBANK, N.A.                           3226
CAPITAL ONE FINANCIAL CORPORATION        2711
BANK OF AMERICA, NATIONAL ASSOCIATION    2580
JPMORGAN CHASE & CO.                     2409
WELLS FARGO & COMPANY                    2001
                                         ... 
JAN (LendUSA) Holdings, Inc.             1   
Novea Portfolio Management LLC           1   
Accounts Receivable Services             1   
Fast Track Servicing                     1   
Financial Recoveries, Inc.               1   
Name: Company, Length: 2197, dtype: int64

In [75]:
complaints_df = df[['Consumer complaint narrative','Product','Company']].rename(columns={'Consumer complaint narrative':'complaints'})

In [76]:
complaints_df

Unnamed: 0,complaints,Product,Company
0,"This auto loan was opened on XX/XX/2020 in XXXX, NC with BB & T in my name. I have NEVER been to North Carolina and I have NEVER been a resident. I have filed a dispute twice through my credit bureaus but both times BB & T has claimed that this is an accurate loan. Which I wasn't aware of until today. I have tried to contact BB & T multiple times but I have never gotten through to a live person. I do n't drive and I have never owned a car before. I didn't have any knowledge of this account until I checked XXXXXXXX XXXX and noticed it. I've tried twice to dispute it. Additionally I never received any bills or information about this account. This is my last resort in trying to remove this fraudulent loan off of my account.",Vehicle loan or lease,TRUIST FINANCIAL CORPORATION
1,"In XXXX of 2019 I noticed a debt for {$620.00} on my credit which i believed was mine I thought speedy cash had bought one of my old debts and sold it to XXXX XXXX XXXX XXXX. I contacted XXXX XXXX XXXX XXXX and after several attempts of giving my full name, nothing came up in their system. I gave my social and the rep said the account popped up but DID NOT tell me that the account was under someone elses name and continued to let me make a payment. The payment was for {$120.00}. Confirmation number-XXXX. After realizing it was not my account, I called back to get my money back and inform them of the mistake. I was told i needed to mail them an FTC report and dispute letter to get my money back. I completed all of this and when i called again they said they transferred the account back to speedy cash for fraud review and I would need to contact them. After contacting them i was again told that i can not get my money back. The issue im having is this representative at XXXX XXXX played blind to obvious fraud and let an innocent person make a payment on someone elses debt and i want my money back.",Debt collection,CURO Intermediate Holdings
2,"As stated from Capital One, XXXX XX/XX/XXXX and XXXX 2018, My wife and I went to several car dealerships to request for a car loan to get a used car. However, according to their credit requirements unfortunately my credit score was insufficient for the car loan approval at that time. It seemed as though they pulled my credit report multiple times.",Vehicle loan or lease,CAPITAL ONE FINANCIAL CORPORATION
3,"Please see CFPB case XXXX. \n\nCapital One, in the letter they provided ( and attached to that case as their response ) said this : "" The funds were reversed and sent back to XXXX XXXX XXXX on XX/XX/XXXX ''. \n\nXXXX XXXX XXXX ( now XXXX XXXX ) has not received these funds. Staff at XXXX XXXX - and also staff at the account-holder 's business - have looked for return of my money ( {$650.00} ) and find nothing. \n\nCapital One needs to document - actually prove - they returned the funds, as stated in their letter. Capital One must provide electronic information, if the return was made that way, or document the paper check they sent back to XXXX XXXX. \n\nI've left 3 messages about this problem for the person who signed the letter ( XXXX ) from Capital One. I have received no call-backs. \n\nSummary : Capital One said they returned my money on XX/XX/XXXX : they did not. If they continue claim they did, then they need to prove that.",Checking or savings account,CAPITAL ONE FINANCIAL CORPORATION
4,"This debt was incurred due to medical malpractice ( XXXX XXXX XXXX, XXXX, TX ). I asked the doctor to turn over my claim to his malpractice insurance company. This has cost me thousands of dollars to XXXX XXXX XXXX. I am still trying to collect damages from this doctor. He never responded and turned over me to collections Merchants and Professional Collection Bureau , Inc. I sent them a letter describing exactly this issue and instead of not contacting me and verifying my debt they start reporting this debt to the credit reporting agencies. They never verified the debt, like I asked and they never stopped it from being reported when I specifically told them not to, due to the circumstances above.",Debt collection,"Merchants and Professional Bureau, Inc."
...,...,...,...
57448,"I am attempting to make a payment toward my student loans on the Nelnet website today, XX/XX/20, and Nelnet will not allow me to post the payment sooner than XX/XX/20. By the time the payment posts, 2-3 days of additional interest will have accrued and my payments will apply more to interest than is due today, the day that I'm attempting to pay. My understanding was that I could make a payment at any time but this does not appear to be true. The funds are available in my bank account today regardless of whether Nelnet can collect over the weekend. I should not be penalized for this. \n\nI submitted complaint XXXX in XXXX for other deceptive practices with Nelnet. They have not yet resolved the issue identified in that complaint or contacted me as they said they would in their response. I believe this new issue is just one more deceptive practice by this company that causes financial harm to borrowers.",Student loan,"Nelnet, Inc."
57449,Received letter for {$480.00}. Original creditor didnt contact me until past statute of limitations for insurance company recoupment per Arizona law. Debt collection is illegal for phantom debt. Additionally they are phoning my office excessively.,Debt collection,"The Receivable Management Services LLC, New York, NY Branch"
57450,"entire time 10 years until XX/XX/2020. XXXX makes my blood boil. I have called and was lied to told to provide my checking account information over the phone in order to turn my cell phone back on. i called at XXXX them at XXXX {$300.00} was added to my bill. \n\nScam scam scam I was told I can not call the office of the President just to write to XXXX XXXX XXXX XXXX XXXX XXXX XXXX, NM XXXX. I did three thousand times. the last letter I mailed on XX/XX/2020. Two collection agencies later. \n\nI chose to leave XXXX XXXX every time I called the XXXX supervisor would threaten me on a recorded line. I need peace of mind and a good Heart to beat inside of me. Im on a XXXX XXXX due to the stress at XXXX XXXX taking all my money 4 10 years.",Debt collection,"Convergent Resources, Inc."
57451,"I am a customer with Wells Fargo Bank. Recently money was withdrawn on a couple of occasions without my permission or consent to pay for a timeshare account that was never used by me nor anyone connected to me because of unfair policies pertaining to the fees of the said timeshare. I tried cancelling the said timeshare account several times because of these fees that were never mentioned at the initiation. My account was debited to pay for the timeshare fees without my knowledge or consent several times. I tried correcting this with Wells Fargo bank with no avail. I would appreciate it if you can look into this matter for me. I was left with no funds in my account and as such I could not take care of the basic necessities of my day to day life. \nThanks in advance,",Checking or savings account,WELLS FARGO & COMPANY


In [77]:
pd.set_option('display.max_colwidth', None)  # show full text; -1 is deprecated since pandas 1.0
complaints_df



In [78]:
# test_size=0.6 keeps 40% of the rows for fitting and holds out the other 60%
X_train, X_hold = train_test_split(complaints_df, test_size=0.6, random_state=1)

In [79]:
X_train['Product'].value_counts()

Debt collection                8790
Credit card or prepaid card    5223
Mortgage                       3900
Checking or savings account    2816
Student loan                   1189
Vehicle loan or lease          1063
Name: Product, dtype: int64

In [80]:
X_train.shape,X_hold.shape

((22981, 3), (34472, 3))
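
The two shapes follow directly from `test_size = 0.6`. The arithmetic below is a small sketch; it assumes the test share is rounded up and the remainder goes to training, which reproduces the numbers printed above.

```python
import math

n_samples = 22981 + 34472   # rows in X_train plus rows in X_hold, taken from the shapes above
test_size = 0.6

# Round the hold-out share up; whatever is left is used for fitting.
n_hold = math.ceil(test_size * n_samples)
n_train = n_samples - n_hold

print(n_train, n_hold)                 # 22981 34472
print(round(n_train / n_samples, 2))   # 0.4
```

So the model is fitted on roughly 40% of the complaints, and the larger share is held out.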

In [81]:
stemmer = PorterStemmer()

In [82]:
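def tokenize(text):
    # Keep tokens longer than 3 characters that still have more than 2 characters left
    # after stripping 'X', 'x' and '/' from both ends; this drops redaction masks such as 'XXXX'.
    tokens = [word for word in nltk.word_tokenize(text) if len(word) > 3 and len(word.strip('Xx/')) > 2]
    # Stemming is left disabled.
    #stems = [stemmer.stem(item) for item in tokens]
    return tokens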
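
To see what the length filter in `tokenize` keeps and drops, the same condition can be applied to a few hand-picked tokens. The helper name `keep_token` and the sample list are assumptions for illustration; in the notebook the tokens come from `nltk.word_tokenize`.

```python
def keep_token(word):
    """Apply the same length test as `tokenize` to a single token."""
    return len(word) > 3 and len(word.strip('Xx/')) > 2

# Hand-picked tokens, including the redaction masks used in the complaints.
samples = ["XXXX", "XX/XX/2020", "XXXXXXXX", "loan", "the", "fraudulent", "xxxx"]
kept = [w for w in samples if keep_token(w)]
print(kept)   # ['XX/XX/2020', 'loan', 'fraudulent']
```

Pure masks disappear, but a partially masked date such as `XX/XX/2020` survives because `2020` remains after stripping, which is why tokens like `xx/xx/2019` can still appear in the vocabulary.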

In [83]:
# use_idf=False and norm=None leave plain term counts, which is the input LDA works on
tfidf_vect = TfidfVectorizer(tokenizer=tokenize, stop_words="english", max_df=0.75, min_df=50, max_features=10000, use_idf=False, norm=None)
tfidf_vect

TfidfVectorizer(max_df=0.75, max_features=10000, min_df=50, norm=None,
                stop_words='english',
                tokenizer=<function tokenize at 0x00000009748A93A8>,
                use_idf=False)
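
With `use_idf=False` and `norm=None`, the vectorizer above should produce raw term counts rather than weighted TF-IDF scores. The sketch below checks this on a made-up three-document corpus using the default tokenizer; the documents are assumptions for illustration only.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "late payment fee on my credit card",
    "mortgage payment and escrow payment",
    "debt collection calls about a debt",
]

# No IDF weighting and no normalisation: only term frequencies remain.
tf = TfidfVectorizer(use_idf=False, norm=None).fit_transform(docs)
counts = CountVectorizer().fit_transform(docs)

same_values = np.array_equal(tf.toarray(), counts.toarray())
print(same_values)   # True
```

In other words, a `CountVectorizer` with the same settings would likely serve equally well here.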

In [84]:
tf_vectors = tfidf_vect.fit_transform(X_train.complaints)

In [85]:
tf_vectors.toarray()


array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])

In [86]:
tfidf_vect.get_feature_names_out().tolist()

['0.00',
 '1.00',
 '10.00',
 '100.00',
 '1000.00',
 '10000.00',
 '100000.00',
 '110.00',
 '1100.00',
 '11000.00',
 '12.00',
 '120.00',
 '1200.00',
 '12000.00',
 '130.00',
 '1300.00',
 '13000.00',
 '140.00',
 '1400.00',
 '14000.00',
 '15.00',
 '150.00',
 '1500.00',
 '15000.00',
 '160.00',
 '1600.00',
 '1681c-2',
 '1681m',
 '1692',
 '1692g',
 '170.00',
 '1700.00',
 '180.00',
 '1800.00',
 '18000.00',
 '190.00',
 '1900.00',
 '2.00',
 '20.00',
 '200.00',
 '2000.00',
 '20000.00',
 '2016',
 '2017',
 '2018',
 '2019',
 '2019.',
 '2020',
 '210.00',
 '2100.00',
 '220.00',
 '2200.00',
 '230.00',
 '2300.00',
 '24-48',
 '240.00',
 '2400.00',
 '25.00',
 '250.00',
 '2500.00',
 '25000.00',
 '260.00',
 '2600.00',
 '270.00',
 '2700.00',
 '28.00',
 '280.00',
 '2800.00',
 '29.00',
 '290.00',
 '2900.00',
 '3.00',
 '30.00',
 '300.00',
 '3000.00',
 '30000.00',
 '310.00',
 '3100.00',
 '320.00',
 '3200.00',
 '330.00',
 '3300.00',
 '34.00',
 '340.00',
 '35.00',
 '350.00',
 '3500.00',
 '36.00',
 '360.00',
 '3600.

In [87]:
# n_components : number of topics to extract
# max_iter : maximum number of passes over the training data
# learning_method : 'online' updates the topics from mini-batches of documents instead of the full corpus at once
# learning_offset : positive value that downweights the early iterations in online learning
lda = decomposition.LatentDirichletAllocation(n_components=6,learning_method = 'online',max_iter=3,learning_offset=50,n_jobs=-1,random_state=42)

w1 = lda.fit_transform(tf_vectors)
H1 = lda.components_
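
The two matrices returned here have fixed shapes: documents by topics, and topics by vocabulary. A minimal sketch on a made-up four-document corpus (the documents and the choice of two topics are assumptions) shows the shapes and confirms that each document's topic weights sum to 1.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "loan payment late fee",
    "mortgage loan escrow payment",
    "credit card fraud charge",
    "card charge dispute fraud",
]

counts = CountVectorizer().fit_transform(docs)

toy_lda = LatentDirichletAllocation(
    n_components=2, learning_method="online", max_iter=5,
    learning_offset=50, random_state=0,
)
doc_topic = toy_lda.fit_transform(counts)   # shape: (n_documents, n_topics)
topic_word = toy_lda.components_            # shape: (n_topics, n_vocabulary_terms)

print(doc_topic.shape, topic_word.shape)
print(doc_topic.sum(axis=1))                # each row sums to 1
```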

In [88]:
w1

array([[0.00372694, 0.00372995, 0.65141969, 0.33361666, 0.0037704 ,
        0.00373636],
       [0.97670543, 0.00464765, 0.00466262, 0.00465201, 0.00465337,
        0.00467893],
       [0.0011038 , 0.00110356, 0.13212289, 0.46887483, 0.39569098,
        0.00110393],
       ...,
       [0.0046509 , 0.00468348, 0.00466674, 0.97667567, 0.00467518,
        0.00464804],
       [0.00696273, 0.00699808, 0.00697056, 0.00702475, 0.85605544,
        0.11598843],
       [0.00420032, 0.00418469, 0.00421817, 0.56073177, 0.42246111,
        0.00420394]])

In [89]:
H1

array([[3.98675855e+01, 9.52811970e+00, 2.03818932e+01, ...,
        5.64898983e+01, 1.88148463e+00, 4.19051296e+01],
       [4.73325494e+02, 1.89234348e+02, 1.75800745e+02, ...,
        1.69997383e-01, 1.95837659e-01, 1.33536895e+02],
       [1.68784951e-01, 1.73669193e-01, 1.68290648e-01, ...,
        3.40769280e+01, 1.70989035e-01, 3.67652617e+01],
       [6.56921734e+01, 1.67670931e-01, 3.68856966e+00, ...,
        1.69209000e-01, 2.50672929e+01, 2.52108575e+01],
       [3.83508854e+00, 1.84282068e-01, 8.49933041e+00, ...,
        9.30609655e+00, 8.51320736e+01, 1.00301323e+02],
       [8.54477053e+00, 1.67863547e-01, 1.67215466e-01, ...,
        3.28726777e+01, 1.67344199e-01, 1.54947877e+01]])
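
The values in `H1` are unnormalised topic-word weights, not probabilities. Dividing each row by its sum turns a topic into a distribution over words. The small array below is hypothetical and stands in for `lda.components_`.

```python
import numpy as np

# Hypothetical weights for 2 topics over a 4-word vocabulary.
components = np.array([
    [40.0, 10.0, 0.2, 0.2],
    [0.2, 0.2, 30.0, 20.0],
])

# Normalise every row so that each topic sums to 1.
topic_word_probs = components / components.sum(axis=1, keepdims=True)
print(topic_word_probs.round(3))
```

Normalising does not change the ranking of words within a topic, so the top words come out the same either way.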

In [90]:
num_words = 15

vocab = np.array(tfidf_vect.get_feature_names_out())
#vocab

# pick the num_words highest-weighted words for each topic
top_words = lambda t: [vocab[i] for i in np.argsort(t)[:-num_words-1:-1]]
topic_words = ([top_words(t) for t in H1])
topics = [' '.join(t) for t in topic_words]

In [91]:
topics

['account card bank credit check money chase fraud closed received customer dispute funds told service',
 'payment balance account credit fees paid xx/xx/2019 statement late received charged date charge payments charges',
 'loan mortgage home property documents received letter modification sent application process request closing foreclosure information',
 'payment payments loan insurance paid mortgage company loans month time years account escrow late make',
 'told called said phone asked number time company just received know calls credit spoke times',
 'debt credit account report collection company information reporting letter agency sent provide original received proof']
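
The slice `[:-num_words-1:-1]` used to pick the top words is compact but easy to misread. The sketch below walks through it on a five-word vocabulary with hypothetical weights.

```python
import numpy as np

vocab = np.array(["account", "bank", "card", "debt", "loan"])
weights = np.array([5.0, 1.0, 9.0, 3.0, 7.0])   # hypothetical weights for one topic
num_words = 3

ascending = np.argsort(weights)            # indices from smallest to largest weight
top_idx = ascending[:-num_words - 1:-1]    # walk backwards and stop after num_words items
top = vocab[top_idx].tolist()
print(top)   # ['card', 'loan', 'account']

# An equivalent spelling that some readers find easier to follow.
same = vocab[np.argsort(weights)[::-1][:num_words]].tolist()
```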

In [92]:
colnames = ["Topic "+str(i) for i in range(lda.n_components)] # lda.n_components is an integer, so range() can use it directly
docnames = ["Doc "+ str(i)  for i in range(len(X_train.complaints))]
df_doc_topic = pd.DataFrame(np.round(w1, 2), columns=colnames, index=docnames)
significant_topics = np.argmax(df_doc_topic.values,axis=1)
df_doc_topic['Dominant topic'] = significant_topics

### Apply LDA to the hold-out set

In [93]:
Whold= lda.transform(tfidf_vect.transform(X_hold.complaints[:10]))
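
Note that the hold-out text goes through `transform`, not `fit_transform`, so it is mapped onto the vocabulary learned from the training data. A small sketch with made-up documents shows the effect: words never seen during fitting are simply dropped.

```python
from sklearn.feature_extraction.text import CountVectorizer

train_docs = ["loan payment late", "credit card fraud"]
hold_docs = ["late loan payment on a timeshare"]

vectorizer = CountVectorizer()
vectorizer.fit(train_docs)                      # the vocabulary is learned here only
hold_matrix = vectorizer.transform(hold_docs)   # reuse it for unseen documents

feature_names = sorted(vectorizer.vocabulary_)  # terms in column order
print(feature_names)           # ['card', 'credit', 'fraud', 'late', 'loan', 'payment']
print(hold_matrix.toarray())   # [[0 0 0 1 1 1]]
```

Keeping the columns identical is what lets the fitted LDA model interpret the hold-out matrix.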

In [94]:
colnames = ["Topic "+str(i) for i in range(lda.n_components)] # lda.n_components is an integer, so range() can use it directly
docnames = ["Doc "+ str(i)  for i in range(len(X_hold.complaints[:10]))]
df_doc_topic = pd.DataFrame(np.round(Whold, 2), columns=colnames, index=docnames)
significant_topics = np.argmax(df_doc_topic.values,axis=1)
df_doc_topic['Dominant topic'] = significant_topics

In [95]:
df_doc_topic

Unnamed: 0,Topic 0,Topic 1,Topic 2,Topic 3,Topic 4,Topic 5,Dominant topic
Doc 0,0.51,0.12,0.0,0.1,0.0,0.27,0
Doc 1,0.27,0.0,0.37,0.35,0.0,0.0,2
Doc 2,0.1,0.04,0.0,0.22,0.62,0.02,4
Doc 3,0.01,0.01,0.01,0.63,0.14,0.2,3
Doc 4,0.39,0.04,0.0,0.0,0.38,0.19,0
Doc 5,0.0,0.0,0.0,0.0,0.4,0.58,5
Doc 6,0.14,0.11,0.16,0.29,0.24,0.07,3
Doc 7,0.0,0.0,0.03,0.53,0.24,0.19,3
Doc 8,0.01,0.01,0.01,0.74,0.24,0.01,3
Doc 9,0.87,0.0,0.0,0.12,0.0,0.0,0
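
The dominant topic is just the column with the largest weight, so a document split almost evenly between two topics still gets a single label. One possible way to flag such cases is to look at the gap between the two largest weights. The rows below are copied from the table above; the 0.05 cut-off is an arbitrary assumption.

```python
import numpy as np

# Rows for Doc 0, Doc 1 and Doc 5 from the hold-out table above.
doc_topic = np.array([
    [0.51, 0.12, 0.00, 0.10, 0.00, 0.27],
    [0.27, 0.00, 0.37, 0.35, 0.00, 0.00],
    [0.00, 0.00, 0.00, 0.00, 0.40, 0.58],
])

dominant = doc_topic.argmax(axis=1)

# Gap between the largest and second-largest weight in each row.
sorted_rows = np.sort(doc_topic, axis=1)
margin = sorted_rows[:, -1] - sorted_rows[:, -2]

# Hypothetical cut-off: call a document ambiguous when the gap is small.
ambiguous = margin < 0.05
print(dominant.tolist())    # [0, 2, 5]
print(ambiguous.tolist())   # [False, True, False]
```

Doc 1 is labelled Topic 2, yet Topic 3 is only 0.02 behind, so its label deserves less trust than the others.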
