# Start of Project 1
The goal of this project was to measure how much one field influences another. To do this, we pulled every paper posted on the arXiv. For each paper we collected the doi, date, title, authors, and category (field). The raw data is shown below.

In [1]:
from setup import *

In [2]:
df_raw = pd.read_csv('data_v3.csv')

In [3]:
df_raw

Unnamed: 0.1,Unnamed: 0,doi,date,title,authors,category
0,0,oai:arXiv.org:0704.0002,2007-03-30,Sparsity-certifying Graph Decompositions,Streinu Ileana;Theran Louis,cs
1,1,oai:arXiv.org:0704.0046,2007-04-01,A limit relation for entropy and channel capac...,Csiszar I.;Hiai F.;Petz D.,cs
2,2,oai:arXiv.org:0704.0047,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs
3,3,oai:arXiv.org:0704.0050,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs
4,4,oai:arXiv.org:0704.0062,2007-03-31,On-line Viterbi Algorithm and Its Relationship...,Šrámek Rastislav;Brejová Broňa;Vinař Tomáš,cs
5,5,oai:arXiv.org:0704.0090,2007-04-01,Real Options for Project Schedules (ROPS),Ingber Lester,cs
6,6,oai:arXiv.org:0704.0098,2007-04-01,Sparsely-spread CDMA - a statistical mechanics...,Raymond Jack;Saad David,cs
7,7,oai:arXiv.org:0704.0108,2007-04-01,Reducing SAT to 2-SAT,Gubin Sergey,cs
8,8,oai:arXiv.org:0704.0213,2007-04-02,Geometric Complexity Theory V: On deciding non...,Narayanan Ketan D. Mulmuley Hariharan,cs
9,9,oai:arXiv.org:0704.0217,2007-04-02,Capacity of a Multiple-Antenna Fading Channel ...,Santipach Wiroonsak;Honig Michael L.,cs


# Drop duplicate/NaN titles
When we scraped the data, we found that every time an author revised a paper, a new copy of it was created on the arXiv. To avoid counting a single paper multiple times, I dropped duplicate papers based on their titles, keeping the last copy of each. This reduced the data from 2 million papers to 1.3 million unique papers.

In [4]:
df_raw.drop_duplicates('title',keep = 'last',inplace = True)
len(df_raw)

1319570

In [5]:
# Drop rows with a missing title; calling dropna on the column alone would leave df_raw unchanged
df_raw.dropna(subset=['title'], inplace=True)
len(df_raw)

1319570
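
As a minimal sketch of how these two cleaning steps behave, the toy frame below (invented rows, not taken from `data_v3.csv`) is de-duplicated by title and then stripped of missing titles. It assumes only standard pandas behavior: `drop_duplicates` treats NaN titles as duplicates of one another, and `dropna` needs `subset=` to act on whole rows.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_raw; these rows are invented for illustration only.
toy = pd.DataFrame({
    'title': ['Paper A', 'Paper A', 'Paper B', np.nan, np.nan],
    'authors': ['X;Y', 'X;Y', 'Z', 'Q', 'R'],
})

# keep='last' retains the final copy of each repeated title.
# NaN titles are treated as equal to each other, so only one of them survives.
deduped = toy.drop_duplicates('title', keep='last')

# Dropping NaN titles has to act on the DataFrame rows (via subset=).
cleaned = deduped.dropna(subset=['title'])

print(len(toy), len(deduped), len(cleaned))
```

If the length does not change after `dropna`, that only means the surviving rows had no missing titles; it is worth checking `df_raw['title'].isna().sum()` directly.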

# Split authors by ";"
Because all of a paper's authors were stored together in a single column, individual authors would have been difficult to extract. The authors were, however, separated by semicolons, so I created a new column holding each paper's authors as a list that can be read more easily.

In [6]:
df_raw['author_split']  = df_raw['authors'].str.split(';')

In [9]:
# Drop papers that have no listed authors
df_raw = df_raw[df_raw['authors'].notna()]

In [7]:
df_raw

Unnamed: 0.1,Unnamed: 0,doi,date,title,authors,category,author_split
2,2,oai:arXiv.org:0704.0047,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs,"[Kosel T., Grabec I.]"
3,3,oai:arXiv.org:0704.0050,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs,"[Kosel T., Grabec I.]"
4,4,oai:arXiv.org:0704.0062,2007-03-31,On-line Viterbi Algorithm and Its Relationship...,Šrámek Rastislav;Brejová Broňa;Vinař Tomáš,cs,"[Šrámek Rastislav, Brejová Broňa, Vinař Tomáš]"
7,7,oai:arXiv.org:0704.0108,2007-04-01,Reducing SAT to 2-SAT,Gubin Sergey,cs,[Gubin Sergey]
8,8,oai:arXiv.org:0704.0213,2007-04-02,Geometric Complexity Theory V: On deciding non...,Narayanan Ketan D. Mulmuley Hariharan,cs,[Narayanan Ketan D. Mulmuley Hariharan]
10,10,oai:arXiv.org:0704.0218,2007-04-02,On Almost Periodicity Criteria for Morphic Seq...,Pritykin Yuri,cs,[Pritykin Yuri]
11,11,oai:arXiv.org:0704.0229,2007-04-02,Geometric Complexity Theory VI: the flip via s...,Mulmuley Ketan D.,cs,[Mulmuley Ketan D.]
13,13,oai:arXiv.org:0704.0301,2007-04-03,Differential Recursion and Differentially Alge...,Kawamura Akitoshi,cs,[Kawamura Akitoshi]
15,15,oai:arXiv.org:0704.0309,2007-04-02,The Complexity of HCP in Digraps with Degree B...,Zhu Guohun,cs,[Zhu Guohun]
17,17,oai:arXiv.org:0704.0468,2007-04-03,Inapproximability of Maximum Weighted Edge Bic...,Tan Jinsong,cs,[Tan Jinsong]
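
The splitting step can also be sketched on a toy frame. This is a variation on the cells above, not the notebook's exact method: the helper `split_authors` is hypothetical, and it additionally trims whitespace and discards empty fragments (for example from a trailing semicolon), which the notebook handles later inside the credit loop.

```python
import numpy as np
import pandas as pd

# Invented example rows; only the ';' separator convention comes from the data.
toy = pd.DataFrame({
    'authors': ['Kosel T.;Grabec I.', np.nan, 'Gubin Sergey;', ' Tan Jinsong '],
})

# Drop rows with no author string first, so every remaining entry can be split.
toy = toy.dropna(subset=['authors']).copy()

def split_authors(raw):
    """Split on ';', trim whitespace, and discard empty fragments."""
    return [name.strip() for name in raw.split(';') if name.strip()]

toy['author_split'] = toy['authors'].apply(split_authors)
print(toy['author_split'].tolist())
```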


# Credit Matrix
The first step in building a field-field influence matrix was to build a credit matrix, $\textbf{C}_{ij}$. To construct it, the unique authors were extracted from the data and used as the rows of the credit matrix, $i$. Similarly, the fields were extracted and used as the columns, $j$.
To fill the credit matrix, each paper was passed through a loop that read its authors and category. Each paper was worth one credit. Each author, $i$, received an equal share of that credit, $\frac{1}{\text{number of authors}}$, in the field, $j$, in which the paper was written.

In [10]:
# Unique author names across all papers, excluding empty strings (e.g. from a trailing ';')
authors = list(set(auth for coauths in df_raw['author_split'] for auth in coauths) - {''})
fields = df_raw['category'].unique()

# Authors x fields DataFrame initialized to zero
Credit_mat  = pd.DataFrame(0.0,index=authors,columns=fields)
for (coauths,field) in df_raw[['author_split','category']].values:
    if len(coauths) > 0:
        # Each paper is worth one credit, split equally among its listed authors
        cred = 1/len(coauths)
        for auth in coauths:
            if(len(auth)>0):  # skip empty author strings
                Credit_mat.loc[auth,field] += cred
    else:
        continue
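
Updating one cell at a time with `.loc` can be slow on over a million papers. A possible vectorized alternative, shown here on invented toy data, uses `DataFrame.explode` (available in pandas 0.25 and later) followed by a group-by. This is a sketch under the assumption that `author_split` contains no empty strings; it is not the code that produced the results in this notebook.

```python
import pandas as pd

# Toy papers: author lists and fields are invented for illustration.
toy = pd.DataFrame({
    'author_split': [['A', 'B'], ['A'], ['B', 'C', 'D'], ['C']],
    'category': ['cs', 'math', 'cs', 'stat'],
})

# Each paper is worth one credit, shared equally among its listed authors.
toy['cred'] = 1.0 / toy['author_split'].str.len()

# One row per (paper, author) pair, then sum credit per author and field.
long_form = toy.explode('author_split')
credit = (long_form.groupby(['author_split', 'category'])['cred']
                   .sum()
                   .unstack(fill_value=0.0))
print(credit)
```

A useful sanity check is that the entries of the credit matrix sum to the number of papers, since each paper hands out exactly one credit.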

In [11]:
Credit_mat

Unnamed: 0,cs,econ,eess,math,physics:astro-ph,physics:cond-mat,physics:gr-qc,physics:hep-ex,physics:hep-lat,physics:hep-ph,physics:hep-th,physics:math-ph,physics:nlin,physics:nucl-ex,physics:nucl-th,physics:physics,physics:quant-ph,q-bio,q-fin,stat
Sidiropoulos G.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.013,0.000,0.000,0.000,0.000
Schleier-Smith Monika,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.061,0.000,0.000,0.000
Alecu A.,0.000,0.000,0.000,0.000,0.014,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Holtman Koen,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Kopf Johannes,0.417,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Chagelishvili G. D.,0.000,0.000,0.000,0.000,2.417,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.783,0.000,0.000,0.000,0.000
Liu Gang-Qin,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.446,0.000,0.000,0.000
Trunov A.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Parsaei Foad,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.333,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
"Milgrom Mordehai Dept. of Condensed-Matter Physics, Weizmann Institute",0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000


# Test for Non-authors, Sort by highest credits
After building the credit matrix, the authors were sorted by total credit, from highest to lowest. This made it possible to visually scan for non-authors that should be taken out of the dataset.


In [13]:
Test_sum = np.asarray(Credit_mat)
Test_sum = Test_sum.sum(axis =1, keepdims = True)
Test_sum = pd.DataFrame(Test_sum, index = authors)
Test_sum.sort_values(by = [0],ascending = False).head(25)

Unnamed: 0,0
CMS Collaboration,675.0
ATLAS Collaboration,635.5
Shelah Saharon,438.317
Aubert B.,343.084
The BABAR Collaboration,276.182
D0 Collaboration,254.274
Iorio Lorenzo,208.917
Sarma S. Das,199.245
CDF Collaboration,195.647
ALICE Collaboration,187.001
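
The same ranking can be sketched directly in pandas, without converting to a NumPy array. The toy credit values below are invented, and flagging names that contain the word "Collaboration" is only an assumed heuristic to speed up the visual scan; it would not catch every invalid entry.

```python
import pandas as pd

# Toy credit matrix; the numbers are invented for illustration.
credit = pd.DataFrame(
    {'cs': [0.0, 1.5, 0.0], 'physics:hep-ex': [675.0, 0.0, 2.0]},
    index=['CMS Collaboration', 'Holtman Koen', 'Aubert B.'],
)

# Total credit per author, then the highest-ranked rows.
totals = credit.sum(axis=1)
top = totals.nlargest(2)

# Heuristic: group names often contain the word "Collaboration".
flagged = top[top.index.str.contains('Collaboration')]
print(top)
print(flagged)
```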


# Drop Non-authors
Sorting by credit showed that the raw data contained some invalid authors that could heavily distort the final matrix. Therefore, to model field-field influence appropriately, the rows (papers) containing these authors were removed from the data.

In [14]:
df_raw =df_raw[df_raw['authors'].str.contains('CMS Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('ATLAS Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('The BABAR Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('D0 Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('CDF Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('ZEUS Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('ALICE Collaboration')==False]
df_raw =df_raw[df_raw['authors'].str.contains('Ma Ernest UC Riverside')==False]
df_raw =df_raw[df_raw['authors'].str.contains('Collaboration for the ALICE')==False]
len(df_raw)

1315478
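
The repeated filters above could also be written as a single mask. The sketch below runs on invented rows and uses a shortened, illustrative list of names; `re.escape` is used so that each name is matched literally rather than as a regular expression.

```python
import re
import pandas as pd

# Invented rows for illustration.
toy = pd.DataFrame({'authors': [
    'CMS Collaboration',
    'Kosel T.;Grabec I.',
    'Aubert B.;The BABAR Collaboration',
    'Gubin Sergey',
]})

# Illustrative subset of the names removed in the cell above.
invalid = ['CMS Collaboration', 'ATLAS Collaboration', 'The BABAR Collaboration']

# One alternation pattern, with each name escaped so it is matched literally.
pattern = '|'.join(re.escape(name) for name in invalid)
mask = toy['authors'].str.contains(pattern, regex=True)

# Keep only the papers that mention none of the invalid names.
filtered = toy[~mask]
print(filtered)
```

Keeping the names in one list makes it easier to extend the filter after each inspection pass.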

# Finding Scott Cook and Bryant Wyatt

In [15]:
df_raw.loc[982225]['authors']

'Wyatt Bryant M.;Petz Jonathan M.;Sumpter William J.;Turner Ty R.;Smith Edward L.;Fain Baylor G.;Hutyra Taylor J.;Cook Scott A.;Hibbs Michael F.;Goderya Shaukat N.'

# Split authors by ";"
After the invalid authors were dropped, the `authors` column was split again so that `author_split` reflects the filtered data.

In [16]:
df_raw['author_split']  = df_raw['authors'].str.split(';')

In [17]:
df_raw

Unnamed: 0.1,Unnamed: 0,doi,date,title,authors,category,author_split
2,2,oai:arXiv.org:0704.0047,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs,"[Kosel T., Grabec I.]"
3,3,oai:arXiv.org:0704.0050,2007-04-01,Intelligent location of simultaneously active ...,Kosel T.;Grabec I.,cs,"[Kosel T., Grabec I.]"
4,4,oai:arXiv.org:0704.0062,2007-03-31,On-line Viterbi Algorithm and Its Relationship...,Šrámek Rastislav;Brejová Broňa;Vinař Tomáš,cs,"[Šrámek Rastislav, Brejová Broňa, Vinař Tomáš]"
7,7,oai:arXiv.org:0704.0108,2007-04-01,Reducing SAT to 2-SAT,Gubin Sergey,cs,[Gubin Sergey]
8,8,oai:arXiv.org:0704.0213,2007-04-02,Geometric Complexity Theory V: On deciding non...,Narayanan Ketan D. Mulmuley Hariharan,cs,[Narayanan Ketan D. Mulmuley Hariharan]
10,10,oai:arXiv.org:0704.0218,2007-04-02,On Almost Periodicity Criteria for Morphic Seq...,Pritykin Yuri,cs,[Pritykin Yuri]
11,11,oai:arXiv.org:0704.0229,2007-04-02,Geometric Complexity Theory VI: the flip via s...,Mulmuley Ketan D.,cs,[Mulmuley Ketan D.]
13,13,oai:arXiv.org:0704.0301,2007-04-03,Differential Recursion and Differentially Alge...,Kawamura Akitoshi,cs,[Kawamura Akitoshi]
15,15,oai:arXiv.org:0704.0309,2007-04-02,The Complexity of HCP in Digraps with Degree B...,Zhu Guohun,cs,[Zhu Guohun]
17,17,oai:arXiv.org:0704.0468,2007-04-03,Inapproximability of Maximum Weighted Edge Bic...,Tan Jinsong,cs,[Tan Jinsong]


# Credit Matrix
The Credit Matrix was constructed again with the refined data.

In [18]:
# Unique author names across all papers, excluding empty strings (e.g. from a trailing ';')
authors = list(set(auth for coauths in df_raw['author_split'] for auth in coauths) - {''})
fields = df_raw['category'].unique()

# Authors x fields DataFrame initialized to zero
Credit_mat  = pd.DataFrame(0.0,index=authors,columns=fields)
for (coauths,field) in df_raw[['author_split','category']].values:
    if len(coauths) > 0:
        # Each paper is worth one credit, split equally among its listed authors
        cred = 1/len(coauths)
        for auth in coauths:
            if(len(auth)>0):  # skip empty author strings
                Credit_mat.loc[auth,field] += cred
    else:
        continue

In [19]:
Credit_mat

Unnamed: 0,cs,econ,eess,math,physics:astro-ph,physics:cond-mat,physics:gr-qc,physics:hep-ex,physics:hep-lat,physics:hep-ph,physics:hep-th,physics:math-ph,physics:nlin,physics:nucl-ex,physics:nucl-th,physics:physics,physics:quant-ph,q-bio,q-fin,stat
Sidiropoulos G.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.013,0.000,0.000,0.000,0.000
Schleier-Smith Monika,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.061,0.000,0.000,0.000
Alecu A.,0.000,0.000,0.000,0.000,0.014,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Holtman Koen,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Kopf Johannes,0.417,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Chagelishvili G. D.,0.000,0.000,0.000,0.000,2.417,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.783,0.000,0.000,0.000,0.000
Liu Gang-Qin,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.446,0.000,0.000,0.000
Trunov A.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Parsaei Foad,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.333,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
"Milgrom Mordehai Dept. of Condensed-Matter Physics, Weizmann Institute",0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000


# Test for Non-authors, sort by highest credits
The authors were sorted by total credit again to check for any other invalid authors.


In [20]:
Test_sum = np.asarray(Credit_mat)
Test_sum = Test_sum.sum(axis =1, keepdims = True)
Test_sum = pd.DataFrame(Test_sum, index = authors)
Test_sum.sort_values(by = [0],ascending = False).head(20)

Unnamed: 0,0
Shelah Saharon,438.317
Iorio Lorenzo,208.917
Sarma S. Das,199.245
Ramm A. G.,170.167
Tao Terence,159.5
Znojil Miloslav,151.683
Sen Ashoke,150.283
Meißner Ulf-G.,150.092
Volovik G. E.,149.077
Leydesdorff Loet,148.843


# Author Activity Matrix
Once the final credit matrix was constructed, an Author Activity Matrix, $\textbf{A}_{ij}$, was built by dividing the credit an author had in each field by that author's total credit. This gave the proportion of each author's credit that fell in each field.
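
In symbols, $\textbf{A}_{ij} = \textbf{C}_{ij} / \sum_{k} \textbf{C}_{ik}$. The row normalization can be sketched on a toy credit matrix with invented values. The guard against authors with zero total credit is an added assumption; in this notebook every author in the index comes from at least one paper, so it should not be needed here.

```python
import numpy as np
import pandas as pd

# Toy credit matrix (authors x fields); values are invented.
credit = pd.DataFrame(
    [[2.0, 0.0], [1.5, 0.5], [0.0, 0.0]],
    index=['a', 'b', 'c'], columns=['cs', 'math'],
)

# Divide each row by its total; a zero total becomes NaN to avoid 0/0.
row_totals = credit.sum(axis=1)
activity = credit.div(row_totals.replace(0, np.nan), axis=0).fillna(0.0)
print(activity)
```

Every row with nonzero credit sums to 1, which is a quick check that the normalization ran along the intended axis.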

In [21]:
A_mat = np.asarray(Credit_mat)
Matrix_As = A_mat/A_mat.sum(axis=1,keepdims = True)
Matrix_A = pd.DataFrame(Matrix_As, index = authors, columns = fields)
Matrix_A

Unnamed: 0,cs,econ,eess,math,physics:astro-ph,physics:cond-mat,physics:gr-qc,physics:hep-ex,physics:hep-lat,physics:hep-ph,physics:hep-th,physics:math-ph,physics:nlin,physics:nucl-ex,physics:nucl-th,physics:physics,physics:quant-ph,q-bio,q-fin,stat
Sidiropoulos G.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000
Schleier-Smith Monika,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000
Alecu A.,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Holtman Koen,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Kopf Johannes,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Chagelishvili G. D.,0.000,0.000,0.000,0.000,0.755,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.245,0.000,0.000,0.000,0.000
Liu Gang-Qin,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000
Trunov A.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Parsaei Foad,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
"Milgrom Mordehai Dept. of Condensed-Matter Physics, Weizmann Institute",0.000,0.000,0.000,0.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000


# Author Weight Matrix
An Author Weight Matrix, $\textbf{W}_{ij}$, also had to be constructed. Here, the credit an author had in a field was divided by the total credit in that field. This gave the proportion of each field's credit contributed by each author. These values are very close to 0 because each field could potentially have contributions from 1.1 million authors.
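
In symbols, $\textbf{W}_{ij} = \textbf{C}_{ij} / \sum_{k} \textbf{C}_{kj}$. The sketch below applies the column normalization to an invented toy credit matrix. Because the real weights round to 0.000 at three decimals, it also shows one way to display them in scientific notation using pandas' `display.float_format` option.

```python
import numpy as np
import pandas as pd

# Toy credit matrix (authors x fields); values are invented.
credit = pd.DataFrame(
    [[2.0, 0.0], [1.5, 0.5], [0.5, 1.5]],
    index=['a', 'b', 'c'], columns=['cs', 'math'],
)

# Divide each column by its total so that every field's weights sum to 1.
weight = credit / credit.sum(axis=0)

# Scientific notation keeps very small weights visible when printed.
with pd.option_context('display.float_format', '{:.3e}'.format):
    print(weight)
```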

In [22]:
Matrix_Ws = A_mat/A_mat.sum(axis=0,keepdims = True)
Matrix_W = pd.DataFrame(Matrix_Ws,index = authors, columns =fields)
Matrix_W

Unnamed: 0,cs,econ,eess,math,physics:astro-ph,physics:cond-mat,physics:gr-qc,physics:hep-ex,physics:hep-lat,physics:hep-ph,physics:hep-th,physics:math-ph,physics:nlin,physics:nucl-ex,physics:nucl-th,physics:physics,physics:quant-ph,q-bio,q-fin,stat
Sidiropoulos G.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Schleier-Smith Monika,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Alecu A.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Holtman Koen,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Kopf Johannes,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Chagelishvili G. D.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Liu Gang-Qin,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Trunov A.,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
Parsaei Foad,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
"Milgrom Mordehai Dept. of Condensed-Matter Physics, Weizmann Institute",0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000


# Field-Field Influence Matrix
Finally, a Field-Field Influence Matrix, $\textbf{I}_{jj}$, could be constructed by multiplying $\textbf{A}$ and $\textbf{W}$. As constructed, however, $\textbf{A}_{ij}$ and $\textbf{W}_{ij}$ could not be multiplied directly because their dimensions did not line up, so I transposed the A matrix to get $\textbf{A}^{T}_{ji}$. The product $\textbf{A}^{T}_{ji}\times \textbf{W}_{ij}$ is then a $j \times j$ matrix, $\textbf{I}_{jj}$, where $j$ indexes fields.
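
The full chain can be checked on a toy credit matrix with invented values. Written out, each entry is $\sum_{i} \textbf{A}_{ij}\textbf{W}_{ik}$ for fields $j$ and $k$. Since every row of $\textbf{A}$ sums to 1 and every column of $\textbf{W}$ sums to 1, each column of the product should also sum to 1, which gives a simple sanity check.

```python
import numpy as np
import pandas as pd

# Toy credit matrix (authors x fields); values are invented.
credit = pd.DataFrame(
    [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]],
    index=['a', 'b', 'c'], columns=['cs', 'math'],
)

# Activity: share of each author's credit per field (rows sum to 1).
A = credit.div(credit.sum(axis=1), axis=0)
# Weight: share of each field's credit per author (columns sum to 1).
W = credit.div(credit.sum(axis=0), axis=1)

# Fields x fields influence matrix.
influence = A.T.dot(W)
print(influence)
print(influence.sum(axis=0))
```

In this toy case the only author active in both fields is `b`, so the off-diagonal entries come entirely from `b`'s split credit.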

In [23]:
Matrix_I = Matrix_A.T.dot(Matrix_W)

In [24]:
Matrix_I

Unnamed: 0,cs,econ,eess,math,physics:astro-ph,physics:cond-mat,physics:gr-qc,physics:hep-ex,physics:hep-lat,physics:hep-ph,physics:hep-th,physics:math-ph,physics:nlin,physics:nucl-ex,physics:nucl-th,physics:physics,physics:quant-ph,q-bio,q-fin,stat
cs,0.845,0.039,0.188,0.022,0.002,0.005,0.002,0.002,0.002,0.002,0.002,0.004,0.014,0.003,0.002,0.017,0.008,0.031,0.023,0.081
econ,0.0,0.725,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
eess,0.001,0.0,0.533,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
math,0.065,0.027,0.143,0.896,0.003,0.009,0.015,0.003,0.004,0.004,0.017,0.185,0.074,0.004,0.004,0.022,0.02,0.044,0.106,0.093
physics:astro-ph,0.004,0.0,0.007,0.002,0.922,0.006,0.073,0.026,0.005,0.027,0.015,0.005,0.013,0.021,0.017,0.034,0.006,0.006,0.004,0.008
physics:cond-mat,0.009,0.0,0.024,0.006,0.006,0.836,0.011,0.01,0.036,0.008,0.02,0.039,0.092,0.017,0.019,0.09,0.085,0.101,0.055,0.012
physics:gr-qc,0.001,0.0,0.0,0.002,0.013,0.002,0.625,0.002,0.003,0.007,0.045,0.023,0.007,0.001,0.003,0.013,0.011,0.003,0.003,0.001
physics:hep-ex,0.0,0.0,0.0,0.0,0.002,0.001,0.001,0.791,0.001,0.009,0.0,0.0,0.0,0.037,0.002,0.006,0.0,0.0,0.0,0.0
physics:hep-lat,0.0,0.0,0.0,0.0,0.0,0.002,0.001,0.001,0.629,0.014,0.01,0.002,0.001,0.002,0.013,0.001,0.001,0.001,0.002,0.0
physics:hep-ph,0.002,0.0,0.002,0.001,0.013,0.004,0.021,0.067,0.129,0.769,0.066,0.008,0.004,0.045,0.122,0.015,0.01,0.004,0.005,0.002


In [25]:
Matrix_I.to_html('Field-Field_Influence.html')

# Fin