In [1]:
# Widen the notebook container so wide DataFrame outputs fit on screen.
from IPython.display import HTML, display
display(HTML("<style>.container { width:95% !important; }</style>"))

In [12]:
import faiss
import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
from transformers import AutoTokenizer, TFAutoModel

In [15]:
universal_sentence_encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
tokenizer = AutoTokenizer.from_pretrained('allenai/specter')
model = TFAutoModel.from_pretrained('allenai/specter', from_pt=True)

Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFBertModel: ['embeddings.position_ids']
- This IS expected if you are initializing TFBertModel from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFBertModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.


In [27]:
specter = np.load('../data/spectre_embeddings.npy')
abstracts = np.load('../data/use_abstract_embeddings.npy')
article_texts = np.load('../data/use_article_embeddings.npy')
articles = pd.read_pickle('../data/to_index_p4.pkl')
articles.dropna(axis=0, inplace=True)
articles.reset_index(inplace=True)
articles

Unnamed: 0,index,id,title,abstract,text
0,0,P0,Abduction,"In the philosophical literature, the term “abd...",\n1. Abduction: The General Idea\n\nYou happen...
1,1,P1,Affirmative Action,“Affirmative action” means positive steps take...,"\n1. In the Beginning\n\n\nIn 1972, affirmativ..."
2,2,P2,Aesthetics of the Everyday,"In the history of Western aesthetics, the subj...",\n1. Recent History\n\nWith the establishment ...
3,3,P3,Wittgenstein’s Aesthetics,Given the extreme importance that Wittgenstein...,\n1. The Critique of Traditional Aesthetics\n\...
4,4,P4,Schopenhauer’s Aesthetics,The focus of this entry is on Schopenhauer’s a...,"\n1. Brief Background\n\n\nBy the 1870s, Arthu..."
...,...,...,...,...,...
7216,6084,W6084,Stanisław Krajewski,Stanisław Krajewski (born 1950) is a Polish ph...,Stanisław Krajewski (born 1950) is a Polish ph...
7217,6085,W6085,Patrick Stokes (philosopher),Patrick Stokes (born 1978) is an Australian ph...,Patrick Stokes (born 1978) is an Australian ph...
7218,6086,W6086,Ernst Mach,Ernst Waldfried Josef Wenzel Mach (; German: [...,Ernst Waldfried Josef Wenzel Mach (; German: [...
7219,6087,W6087,Jessica Pierce,"Jessica Pierce (born October 21, 1965) is an A...","Jessica Pierce (born October 21, 1965) is an A..."


In [37]:
specter_index = faiss.IndexFlatIP(specter.shape[-1])
abstracts_index = faiss.IndexFlatIP(abstracts.shape[-1])
article_texts_index = faiss.IndexFlatIP(article_texts.shape[-1])

In [38]:
%%time
specter_index.add(specter)
abstracts_index.add(abstracts)
article_texts_index.add(article_texts)

CPU times: user 12.2 ms, sys: 12.1 ms, total: 24.3 ms
Wall time: 22.3 ms
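
`faiss.IndexFlatIP` performs exact search by maximum inner product, so the ranking matches cosine similarity only when the stored vectors and the query are unit length. The notebook does not show whether the saved embeddings were normalized, so the following is a minimal NumPy sketch of the same kind of search under that assumption, not the notebook's actual pipeline. The helper names `l2_normalize` and `inner_product_search` and the toy vectors are hypothetical.

```python
import numpy as np

def l2_normalize(vectors):
    """Return a float32 copy of `vectors` with each row scaled to unit length."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)  # guard against all-zero rows

def inner_product_search(corpus, queries, k):
    """Exact top-k search by inner product, best match first."""
    scores = queries @ corpus.T                # shape (n_queries, n_corpus)
    top = np.argsort(-scores, axis=1)[:, :k]   # row indices of the k highest scores
    return np.take_along_axis(scores, top, axis=1), top

# Toy vectors standing in for the real embedding matrices (hypothetical values).
corpus = l2_normalize([[3.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
queries = l2_normalize([[2.0, 0.1]])

scores, indices = inner_product_search(corpus, queries, k=2)
print(indices)  # one row of k corpus positions per query
print(scores)   # matching inner products (cosine similarities after normalization)
```

Both returned arrays have shape `(n_queries, k)`, the same layout that `index.search` returns. With FAISS itself, `faiss.normalize_L2` normalizes a float32 array in place and could be applied to the embeddings before `add` and to the query before `search`.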


In [61]:
%%time
query = """If a person's head moves, she may or may not have moved her head, and, if she did move it, 
        she may have actively performed the movement of her head or merely, by doing something else, 
        caused a passive movement."""
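# Universal Sentence Encoder: embed the raw query string.
query_use_embedding = universal_sentence_encoder([query]).numpy()
# SPECTER: the query is prefixed with "Sociology " before tokenization and truncated to 512 tokens.
inputs = tokenizer("Sociology " + query, padding=True, truncation=True, return_tensors="tf", max_length=512)
# Use the first ([CLS]) token of the last hidden state as the query embedding.
specter_embedding = model(**inputs).last_hidden_state[:, 0, :].numpy()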

CPU times: user 852 ms, sys: 101 ms, total: 953 ms
Wall time: 187 ms


### USE abstract embeddings

In [51]:
%%time
# Top-10 nearest abstracts by inner product; both arrays have shape (1, 10).
distances, indices = abstracts_index.search(query_use_embedding, 10)
articles.iloc[indices[0]]

CPU times: user 639 ms, sys: 21.4 ms, total: 661 ms
Wall time: 28.9 ms


Unnamed: 0,index,id,title,abstract,text
1694,1694,P1694,Action,"If a person's head moves, she may or may not h...",\n1. The Nature of Action and Agency\n\n\nIt h...
988,988,P988,Incompatibilist (Nondeterministic) Theories of...,To have free will is to have what it takes to ...,\n1. Noncausal Theories\n\nSome incompatibilis...
1691,1691,P1691,Action-based Theories of Perception,Action is a means of acquiring perceptual info...,\n1. Early Action-Based Theories\nTwo doctrine...
1328,1328,P1328,Higher-Order Theories of Consciousness,Higher-order theories of consciousness try to ...,\n1. Kinds of Consciousness\n\nOne of the adva...
1337,1337,P1337,Temporal Consciousness,"In ordinary conscious experience, consciousnes...",\n1. Three Models of Temporal Consciousness\n1...
1136,1136,P1136,Epiphenomenalism,Epiphenomenalism is the view that mental event...,\n1. Traditional Arguments (A) Pro\n\nMany phi...
479,479,P479,"Newton’s Views on Space, Time, and Motion",Isaac Newton founded classical mechanics on th...,"\n1. Overview of the Scholium\n\n Today, Newto..."
869,869,P869,Identity Over Time,Irving Copi once defined the problem of identi...,\n1. Introduction\n\nAs a number of philosophe...
1545,1545,P1545,Personal Autonomy,Autonomous agents are self-governing agents. B...,\n1. Introduction\n\n\nWhen people living in s...
515,515,P515,The Epistemic Condition for Moral Responsibility,Philosophers usually acknowledge two individua...,\n1. The Epistemic Condition\n1.1 Contents of ...


### USE text embeddings

In [52]:
%%time
# Top-10 nearest article texts by inner product; both arrays have shape (1, 10).
distances, indices = article_texts_index.search(query_use_embedding, 10)
articles.iloc[indices[0]]

CPU times: user 581 ms, sys: 0 ns, total: 581 ms
Wall time: 25.6 ms


Unnamed: 0,index,id,title,abstract,text
1578,1578,P1578,Aristotle’s Natural Philosophy,Aristotle had a lifelong interest in the study...,"\n1. Natures\n\n Nature, according to Aristotl..."
748,748,P748,Louis de La Forge,Louis de la Forge was among the first group of...,\n1. Life and Works\n\nLa Forge was born on 24...
1355,1355,P1355,Compatibilism,Compatibilism offers a solution to the free wi...,\n1. Free Will and the Problem of Causal Deter...
1238,1238,P1238,Descartes’ Physics,While René Descartes (1596–1650) is well-known...,\n1. A Brief History of Descartes’ Scientific ...
479,479,P479,"Newton’s Views on Space, Time, and Motion",Isaac Newton founded classical mechanics on th...,"\n1. Overview of the Scholium\n\n Today, Newto..."
1545,1545,P1545,Personal Autonomy,Autonomous agents are self-governing agents. B...,\n1. Introduction\n\n\nWhen people living in s...
1465,1465,P1465,Japanese Zen Buddhist Philosophy,Zen aims at the perfection of personhood. To t...,\n1. The Meaning of the Term Zen\n\nThe design...
720,720,P720,Antoine Le Grand,Antoine Le Grand (1629–1699) was a philosopher...,\n1. Life and Writings\n\n\nLe Grand lived in ...
1691,1691,P1691,Action-based Theories of Perception,Action is a means of acquiring perceptual info...,\n1. Early Action-Based Theories\nTwo doctrine...
838,838,P838,Space and Time: Inertial Frames,A “frame of reference” is a standard relative ...,\n1. Relativity and reference frames in classi...


### SPECTER embeddings

In [62]:
%%time
# Top-10 nearest SPECTER embeddings by inner product; both arrays have shape (1, 10).
distances, indices = specter_index.search(specter_embedding, 10)
articles.iloc[indices[0]]

CPU times: user 665 ms, sys: 0 ns, total: 665 ms
Wall time: 28.3 ms


Unnamed: 0,index,id,title,abstract,text
1694,1694,P1694,Action,"If a person's head moves, she may or may not h...",\n1. The Nature of Action and Agency\n\n\nIt h...
6013,4760,W4760,Peter G. Ossorio,Peter G. Ossorio (4 May 1926 – 24 April 2007) ...,Peter G. Ossorio (4 May 1926 – 24 April 2007) ...
479,479,P479,"Newton’s Views on Space, Time, and Motion",Isaac Newton founded classical mechanics on th...,"\n1. Overview of the Scholium\n\n Today, Newto..."
1524,1524,P1524,Behaviorism,It has sometimes been said that “behave is wha...,\n1. What is Behaviorism?\n\n\nOne has to be c...
1691,1691,P1691,Action-based Theories of Perception,Action is a means of acquiring perceptual info...,\n1. Early Action-Based Theories\nTwo doctrine...
1430,1430,P1430,Margaret Lucas Cavendish,"Margaret Lucas Cavendish was a philosopher, po...",\n1. Introduction and Biography\n\nMargaret Lu...
10,10,P10,Agency,"In very general terms, an agent is a being wit...","\n1. Introduction\nIn a very broad sense, agen..."
1545,1545,P1545,Personal Autonomy,Autonomous agents are self-governing agents. B...,\n1. Introduction\n\n\nWhen people living in s...
1562,1562,P1562,Tense and Aspect,Time flies like an arrow.… Fruit flies like a ...,\n1. Introduction\n\nTense roughly means refer...
1146,1146,P1146,The Psychology of Normative Cognition,"From an early age, humans exhibit a tendency t...",\n1. A Psychological Capacity Dedicated to Nor...
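
One quick way to compare the three result tables above is to count how many article ids they share. The sketch below hard-codes the `id` column of each top-10 table as it appears in the outputs; the dictionary keys and variable names are hypothetical labels chosen for illustration, and overlap is only a rough agreement measure, not a quality score.

```python
from itertools import combinations

# Top-10 article ids copied from the three result tables in this notebook.
results = {
    "use_abstract": ["P1694", "P988", "P1691", "P1328", "P1337",
                     "P1136", "P479", "P869", "P1545", "P515"],
    "use_text": ["P1578", "P748", "P1355", "P1238", "P479",
                 "P1545", "P1465", "P720", "P1691", "P838"],
    "specter": ["P1694", "W4760", "P479", "P1524", "P1691",
                "P1430", "P10", "P1545", "P1562", "P1146"],
}

# Ids returned by both members of each pair of indexes.
pairwise_overlap = {
    (a, b): sorted(set(results[a]) & set(results[b]))
    for a, b in combinations(results, 2)
}

# Ids returned by all three indexes.
shared_by_all = sorted(set.intersection(*(set(ids) for ids in results.values())))

for pair, ids in pairwise_overlap.items():
    print(pair, len(ids), ids)
print("all three:", shared_by_all)  # ['P1545', 'P1691', 'P479']
```

For this query, the USE abstract index and the SPECTER index share four articles, including the source entry P1694 (Action) at rank one in both, while the USE text index shares three articles with each of the others and does not return P1694 in its top ten.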


In [33]:
abstracts_index.search?

In [31]:
abstracts.shape

(7221, 512)

Unnamed: 0,index,id,title,abstract,text
2568,969,W969,Béla Juhos,"Béla Juhos (November 22, 1901, Vienna – May 27...","Béla Juhos (22 November 1901, Vienna – 27 May ..."
6100,4860,W4860,Rudolf Burger,"Rudolf Burger (born December 8, 1938 in Vienna...","Rudolf Burger (born December 8, 1938 in Vienna..."
3192,1651,W1651,Michail Papageorgiou,Michail Papageorgiou (Greek: Μιχαήλ Παπαγεωργί...,Michail Papageorgiou (Greek: Μιχαήλ Παπαγεωργί...
6216,4990,W4990,Ludwig Landgrebe,"Ludwig Landgrebe (9 March 1902, Vienna – 14 Au...","Ludwig Landgrebe (9 March 1902, Vienna – 14 Au..."
4358,2927,W2927,Victor Kraft,Victor Kraft (4 July 1880 – 3 January 1975) wa...,"Victor Hugo Etler Kraftsov, known as Victor Kr..."
2852,1282,W1282,Elisabeth Samsonov,Elisabeth von Samsonow is an Austrian philosop...,Elisabeth von Samsonow is an Austrian philosop...
1962,291,W291,Friedrich Waismann,Friedrich Waismann (German: [ˈvaɪsman]; 21 Mar...,Friedrich Waismann (German: [ˈvaɪsman]; 21 Mar...
2271,630,W630,Krzysztof Michalski,Krzysztof Michalski (8 June 1948 – 11 February...,Krzysztof Michalski (8 June 1948 – 11 February...
2545,941,W941,Herbert Feigl,Herbert Feigl (; German: [ˈfaɪgl̩]; December 1...,Herbert Feigl (; German: [ˈfaɪgl̩]; December 1...
2973,1415,W1415,Christian von Ehrenfels,Christian von Ehrenfels (also Maria Christian ...,Christian von Ehrenfels (also Maria Christian ...
