# Look at uses of a target word "rational" over time

In [1]:
from __future__ import print_function
import time
import numpy as np
import pandas as pd
import pyarrow
import fastparquet
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
%matplotlib inline
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns
import csv
import textwrap
from scipy.spatial.distance import cosine
import spacy
from collections import defaultdict 
from tqdm import tqdm

pd.set_option('display.max_colwidth', 500)



Load in the file

In [2]:
target = "rational"

In [3]:
tokens = pd.read_csv('./data/logic_words/{}.csv'.format(target))

In [4]:
len(tokens)

1597

In [5]:
parquet_file = "/Volumes/data_gabriella_chronis/corpora/acl-publication-info.74k.parquet"

df = pd.read_parquet(parquet_file, engine='pyarrow')

Left-join the large metadata file onto the token file. Alternatives to consider: look each paper up individually, or join only the year column.

In [6]:
data = tokens.join(df.set_index("corpus_paper_id"), on="corpus_id")
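
The "just get the year columns" option mentioned above could look roughly like the sketch below. The two toy frames and their values are invented for illustration; only the column names `corpus_id`, `corpus_paper_id`, and `year` come from the notebook.

```python
import pandas as pd

# Toy stand-ins for the real frames; all values are invented for illustration.
tokens = pd.DataFrame({
    "corpus_id": [101, 102, 103],
    "sentence": ["a rational agent", "rational numbers", "a rational choice"],
})
papers = pd.DataFrame({
    "corpus_paper_id": [101, 102, 999],
    "year": [1995, 2012, 2020],
    "title": ["A", "B", "C"],
})

# Keep only the key and the year before joining, so the wide metadata
# table is not copied onto every token row.
years = papers[["corpus_paper_id", "year"]]
merged = tokens.merge(
    years,
    how="left",
    left_on="corpus_id",
    right_on="corpus_paper_id",
    validate="many_to_one",  # raises if a paper id appears twice in `years`
)

# Tokens without a matching paper end up with a missing year.
unmatched = merged["year"].isna().sum()
print(merged[["corpus_id", "year"]])
print("unmatched tokens:", unmatched)
```

If any token has no matching paper, `year` contains missing values and a later `astype(int)` would raise, so it may be worth checking `unmatched` before casting.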

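Add a "decade" column. The bins are ten years wide up to 2000 and narrower after that, so the later bins are not true decades.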

In [7]:
data["year"] = data["year"].astype(int)

In [8]:
#data["decade"] = ( data['year'] //10)*10

#bins = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])
bins = [1950, 1960, 1970, 1980, 1990, 2000, 2005, 2010, 2012, 2014, 2016, 2018, 2020]
data["decade"] = pd.cut(data['year'], bins)

In [9]:
data["decade"].unique()

[(2000, 2005], (1990, 2000], (2005, 2010], (2016, 2018], (2010, 2012], ..., (2012.0, 2014.0], NaN, (2014.0, 2016.0], (1970.0, 1980.0], (1960.0, 1970.0]]
Length: 12
Categories (12, interval[int64, right]): [(1950, 1960] < (1960, 1970] < (1970, 1980] < (1980, 1990] ... (2012, 2014] < (2014, 2016] < (2016, 2018] < (2018, 2020]]
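
The `NaN` in the output above is consistent with how `pd.cut` treats bin edges by default: intervals are open on the left and closed on the right, and values outside the outermost edges get no bin. A small sketch with made-up years (not taken from the data) shows the edge behavior:

```python
import pandas as pd

bins = [1950, 1960, 1970, 1980, 1990, 2000, 2005, 2010, 2012, 2014, 2016, 2018, 2020]
years = pd.Series([1950, 1960, 1961, 2000, 2001, 2020, 2021])

# Default right=True: each interval is (left, right], so 1960 lands in
# (1950, 1960], while 1950 and 2021 fall outside every bin.
binned = pd.cut(years, bins)

# right=False flips the convention to [left, right): 2000 now lands in
# [2000, 2005), and 2020 falls outside the last bin instead.
left_closed = pd.cut(years, bins, right=False)

for year, a, b in zip(years, binned, left_closed):
    print(year, a, b)
```

Which convention is preferable depends on whether a label like "2000" should mean the years up to 2000 or the years starting at 2000.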

### Look at 5 example sentences from each decade

In [10]:
#df.style.set_properties(subset=['sentence'], **{'width': '300px'})
pd.set_option('display.max_rows', 1000)


data.groupby('decade').sample(5, replace=True) [['decade', 'sentence' ]]

ValueError: a must be greater than 0 unless no samples are taken
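
One likely cause of this error (an assumption, not verified against the real data) is that `decade` is categorical, so the groupby also visits bins that contain no rows, and sampling from an empty group fails. Passing `observed=True` restricts the groups to bins that actually occur. A minimal sketch on invented toy data:

```python
import pandas as pd

# Toy data in which the (1960, 1970] bin has no rows; values are invented.
toy = pd.DataFrame({
    "year": [1955, 1956, 1975, 1976, 1977],
    "sentence": ["s1", "s2", "s3", "s4", "s5"],
})
toy["decade"] = pd.cut(toy["year"], [1950, 1960, 1970, 1980])

# observed=True restricts the groups to bins that actually contain rows;
# replace=True lets small groups be sampled more than once.
sample = (
    toy.groupby("decade", observed=True)
       .sample(2, replace=True, random_state=0)[["decade", "sentence"]]
)
print(sample)
```

Because `replace=True` can return the same sentence twice, dropping duplicates afterwards may be useful when the samples are meant for reading.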

In [None]:
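# observed=True skips empty decade bins; replace=True allows bins with fewer than 5 rows
save = data.groupby('decade', observed=True).sample(5, replace=True)[['decade', 'sentence']]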
save

rational | ˈraSH(ə)nəl |
adjective
1. based on or in accordance with reason or logic: I'm sure there's a perfectly rational explanation.
 - (of a person) able to think clearly, sensibly, and logically: Andrea's upset—she's not being very rational.
 - endowed with the capacity to reason: man is a rational being.
2. Mathematics (of a number, quantity, or expression) expressible, or containing quantities that are expressible, as a ratio of whole numbers. When expressed as a decimal, a rational number has a finite or recurring expansion.

### ACL Human

|decade | notes |
|---------|-------------------|
|1950 |  |
|1960 | |
|1970 | |
|1980 | |
|1990 |  |
|2000 | |
|2010 |  |
|2020 |  |

The main senses found in ACL are the logical formalism sense and the computer logic sense. Later on, as logic itself becomes a task, we also get logic in the "justifiable by reason" sense. Initially, logical forms are a representation of natural language, and the task is: can we model natural language using logical formalisms? With the advent of feature-based statistical methods, these kinds of logics come to be seen as insufficient. There is a switch, and the task becomes: can we model logical processes (of thought) using other kinds of representations, namely statistical ones?

There's another potential change here: the extension of computer logic to 'business logic'. This term can refer to the logic of a business as encoded in a particular program, or to a more abstract process that has some digital and some analog components but is supposed to operate with the regularity of an algorithm.

### COCA Human

|senses | snippets|
|---|--|
|system of codification or set of principles (often to point out a flaw; limited or not totalizing systems of reason) | by this logic, with this kind of logic, the logic employed to suggest continuity w/ populism|
|symbolic/mathematical | Isn't logic required by math , Math is based on logic, curses aren't. |
| justifiable by reason| there had to be some logic left in the world |
| suggested course of action | wealth-creation logic, logic that constructs and maintains systemic racism in Bolivia |
| computer program | | 

synonyms: sagacity, wisdom, soundness, judgment, rationality, coherence, chain of reasoning, argument, dialectics, deduction

Would we call these examples of the formalism sense of logic polysemous? Let's see if we can make the same substitutions.

1.  In this way the LLFs have a more natural appearance than, for instance, the formulas of *first order logic*. (ACL, emphasis added)
   - (a)   the formulas of deduction
   - (b) * the formulas of wisdom

2.  what with your well-honed expertise in "freshman logic" (COCA, emphasis added)
   - (a)   expertise in deduction
   - (b)   expertise in wisdom

(I realize they aren't the same, but they could very well be paraphrases, given that FOL is the standard in freshman logic)

The point is that the possible substitutions for these otherwise identical senses don't line up.

Let's try the same with another sense, the 

1. We believe that either a three-way modal logic entailment task or a two-way probabilistic *logic entailment* task on its own could make perfect sense. (ACL)
  - (a)  chain of reasoning entailment task
  - (b)  ? argument entailment task
  - (c)  deduction entailment task
  - (d)  rationality entailment task
  - (e)  sagacity entailment task

2. ... never play it), the barrier to RPGs is more knowing their rules and logic and how to find things in menus. Broadly speaking, FPSs feel more like ... (COCA)
  - (a)  knowing their rules and chain of reasoning
  - (b)  knowing their rules and argument
  - (c)  knowing their rules and deduction
  - (d)  knowing their rules and rationality
  - (e) ? knowing their rules and sagacity

3. More complex cases are still based on the usual rules of *propositional logic* such as modus ponens, ((p~q).q) ~q). (ACL)
  - (a)  the usual rules of *propositional chain of reasoning*
  - (b)  the usual rules of *propositional argument*
  - (c)  the usual rules of *propositional deduction*
  - (d)  ? the usual rules of *propositional rationality*
  - (e)  ?? the usual rules of *propositional sagacity*

4. That is the logic Truman used to justify bombing Hiroshima (COCA)
  - (a) the chain of reasoning that Truman used
  - (b) the argument that Truman used
  - (c) the wisdom that Truman used (the semantic felicity here would seem to depend on moral/ethical stance)
  - (d) the deduction that Truman used
  - (e) the sagacity that Truman used
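
The substitution frames in the examples above were written by hand. As a rough sketch, they could also be generated mechanically so that every synonym is tried in every sentence. The helper `substitution_frames` is hypothetical and not part of the notebook, and the acceptability judgments (`?`, `*`) still have to be made by a reader.

```python
import re

def substitution_frames(sentence, target, substitutes):
    """Return one variant of `sentence` per substitute, replacing the
    first whole-word occurrence of `target`."""
    pattern = re.compile(r"\b{}\b".format(re.escape(target)))
    return [pattern.sub(candidate, sentence, count=1) for candidate in substitutes]

sentence = "That is the logic Truman used to justify bombing Hiroshima"
substitutes = ["chain of reasoning", "argument", "wisdom", "deduction", "sagacity"]

frames = substitution_frames(sentence, "logic", substitutes)
for label, frame in zip("abcde", frames):
    print("({}) {}".format(label, frame))
```

Generating the full sentence-by-synonym grid this way makes it easier to see where the substitution patterns of two senses diverge.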

Sense (1) falls closer to the "justifiable by reason" sense than to the "formalism" sense most commonly used in ACL publications, though perhaps it is not exactly the same flavor as the "justifiable by reason" sense in (4). In any case, I imagine that with the advent of connectionism and feature-based statistical machine learning, one sees a tendency towards logic as a system of reasoning to be represented (the object being modeled), as opposed to logic as the model. The sense of logic as a system of inference, or as a course of action made necessary by applying logical methods, is not totalizing. These are partial logics, often invoked to point out a flaw, a limitation, or a blind spot in a particular line of reasoning. When this sense is used in machine learning, I hypothesize that this tendency will be less prevalent, because the use emerges from a meaning of logic that is supposed to be totalizing: a totalizing model of grammar.
