# Argument Mining
This notebook uses the `spacy_arguing_lexicon` Python package, a spaCy extension built on the [MPQA arguing lexicon](http://mpqa.cs.pitt.edu/lexicons/arg_lexicon/).
Based on that lexicon, the parser identifies the following categories of argument expression:
 - `assessments`
 - `authority`
 - `causation`
 - `conditionals`
 - `contrast`
 - `difficulty`
 - `doubt`
 - `emphasis`
 - `generalization`
 - `inconsistency`
 - `inyourshoes`
 - `necessity`
 - `possibility`
 - `priority`
 - `rhetoricalquestion`
 - `structure`
 - `wants`
 
For more detail, see the [README for the Arguing Lexicon (PDF)](https://github.com/fako/spacy_arguing_lexicon/blob/master/spacy_arguing_lexicon/lexicon/README%20for%20Arguing%20Lexicon.pdf).
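
To make the idea of a lexicon match concrete, here is a toy sketch of pattern-per-category matching. The patterns, `TOY_LEXICON` and `find_matches` are all made up for illustration; they are not taken from the MPQA lexicon or from `spacy_arguing_lexicon`, which ships its own, much larger pattern files.

```python
import re

# Hypothetical mini-lexicon: one list of regex patterns per label.
# These patterns are invented for this sketch, not copied from the MPQA files.
TOY_LEXICON = {
    "assessments": [r"\bin my opinion\b", r"\bmy point is\b"],
    "doubt": [r"\bi doubt\b", r"\bnot convinced\b"],
    "necessity": [r"\bneeds? to\b", r"\bought to\b"],
}


def find_matches(sentence):
    """Return (label, matched text) pairs for every toy pattern that fires."""
    matches = []
    for label, patterns in TOY_LEXICON.items():
        for pattern in patterns:
            for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
                matches.append((label, match.group(0)))
    return matches


print(find_matches("In my opinion, they need to try again."))
```

A single sentence can trigger more than one category, which is worth keeping in mind when reading the per-sentence results further down.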

In [4]:
import spacy
from spacy_arguing_lexicon import ArguingLexiconParser
from spacy.language import Language



In [5]:
nlp = spacy.load("en_core_web_sm")

In [6]:
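# Register a factory so the parser can be added to the pipeline by name.
@Language.factory('ArguingLexiconParser', default_config={"lang": nlp.lang})
def CreateArguingLexiconParser(nlp, name, lang):
    # Pass the configured language through rather than relying on the parser's default.
    return ArguingLexiconParser(lang=lang)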

In [7]:
nlp.add_pipe('ArguingLexiconParser')

<spacy_arguing_lexicon.parsers.ArguingLexiconParser at 0x7f0099966160>

In [1]:
contents = ("""
    "Another strand of criticism aimed at GPT-3 and other LLMs is that the results they produce often tend to display toxicity and reproduce ethnic, racial, and other bias. This really comes as no surprise, keeping in mind where the data used to train LLMs is coming from: the data is all generated by people, and to a large extent it has been collected from the web. Unless corrective action is taken, its entirely expectable that LLMs will produce such output.Last but not least, LLMs take lots of resources to train and operate. Chomskys aphorism about GPT-3 is that ""its only achievement is to use up a lot of Californias energy"". But Chomsky is not alone in pointing this out. In 2022, DeepMind published a paper, ""Training Compute-Optimal Large Language Models,"" in which analysts claim that training LLMs has been done with a deeply suboptimal use of compute."
""")

In [33]:
contents = ("""
Well, our understanding was that we could buy that here.
In my opinion, that is not a good idea.
So, my take on this whole thing is as follows.
You know it seemed to me that we could do this faster.
It would seem to me that there is more to it than that.
It would appear to us that his heart's not in it.
It appears to me that fashion was the farthest thing from Jesus' mind.
My point is that you can't compare the capabilty here.
My main point is that you can't compare the two.
When I looked at EWS it looked to me as if it did only 2 things.
I get the impression that Robin is not a Mac enthusiast.
Atiku Abubakar, a second opposition candidate, said: "My impression is that this is a sham. It is no election."
In my book this is Sasha's best mix. It is quite simply outstanding.
And, to our mind, this is the only correct mode for working with CVS.
To our way of thinking, she is a quiet revolutionary.
As far as I am concerned there are some bloggers who just plain don't exist to me.
Pretty wonderful (if you ask me...)
On the other hand, my feeling is that robotically-added links should not automatically confer PageRank.
Not from where I am sitting, but does that mean anything?
RUMSFELD: Senator, I don't think that's true. I have never painted a rosy picture.
All I'm Saying is that I'm hiking exclusively in no-dogs-allowed parks for some time to come.
What we're saying is that there's a massive amount of disquiet about British foreign policy.
We're not saying that you don't need coal, but when you do mine the coal there are responsibilities to it. 
I guess what I'm trying to say is, if I can change, and you can change, everybody can change.
So what I mean is that they use the rhetoric of sexual liberation, empowerment. 

According to WHO recommendations, male circumcision should be followed by at least six weeks of abstinence.

If we want to use all the new features, we need to kill off legacy browsers that don't support it.
We need to kill off legacy browsers, if we want to use all the new features.
So, it would be realy nice if somebody could explain this to me! 
Wouldn't it be nice if we could just pass a law and fix America's schools? 
We want a bearded lady, an elephant, and if we could get midgets, that would be nice.
It won't work real well unless the parts are really distinct.
Unless you run the ball, any offense with poor QB play won't work real well. 
Actually, such a decomposition has to be done in order to support usability inspection methods and to identify and fix faults. 
In order to succeed there has to be development, says Ms. Papastavrou.
As long as we believe he should have found a "better" occupation, we will be reluctant to accept his choice of professions. 
We will be reluctant to accept his choice of professions, as long as we believe he should have found a "better" occupation, . 
We better hang out more or else I am dis-owning my bro.
It's pretty essential if most of your forums are post moderated, otherwise there'd be lots of double posting and confused people wondering where their posts.

uh as opposed to what you'd really want to know if you were gonna use this thing
um i think they should be recorded instead of written
so rather than say the most interesting thing something interesting
there's a median filtering and then there's a piece-wise linear fit based on some criteria
that's just a whole nother issue
there's no i mean the language model for switchboard is totally different
It's a whole new ballgame.
o_k so that's a that's a separate issue

so i see this like a you know a a topic i'm not convinced about
uh r- uh then i don't see how that would make it louder
i mean it's not clear that these musical noises hurt us in recognition
I doubt that the suns can advance.
We are a bit doubtful about that.

I have no doubt about it.
I say pretended because well, when you really think about it hating takes a lot of bitterness and resentment.
It is pretty obvious that the debasement of the human mind caused by a constant flow of fraudulent advertising is no trivial thing.
If you see the logo you'll know your digital music will play for sure.
I am confident that with good will on both sides that we can move beyond political statements and agree on a bill that gives our troops the funds 
If Google indexed only 100000 documents, would it be nearly as useful as it is today? Of course not.
There's no doubt about it, this is best retro compilation ever!
We say "without a doubt" to express that something is certain.
I bet that Holmes will have an accident too. I should go to lunch.
Kinky Friedman's candidacy is bound to be something; what that something is is still up for debate. 
If we really want to shift the reality in Iraq, Palestine and Lebanon and elsewhere, there's no two ways about it.
Sometimes there are no two ways about it. Some problems have only one solution.
Wood is a cantankerous substance; there's no two ways about it. 
One issue might be that some 32-bits ppc cpu might have more than one DABRs.
My feeling is that the sanctimonious one here might just be the same person who fails to recognize tongue-in-cheek when he sees it. 
Bees sense quantum fields, and that's why they are disappearing.
Bees sense quantum fields, and that is why they are disappearing
But seriously, the idea here is NOT to just stick the transmitter under your laces.
My whole point is that endings imply unity or wholeness in a duration of time that can only be achieved by some agency.
So I think what you have to do is to have the attitude that whatever you have that particular night you have to find a way to win. 
The reason is that Barton wants to use the British India story to argue that the pith of American conservation lies in imperial forestry.
Here's What the Billionaires Are Buying.
Here is what the couple thinks about war.
Paris is going to be staring in the remake of that old Linda Blair Movie.
I think in the end what will happen is that the end of crippled, RAM-sharing onboard video is near. 
I was raised to believe what's gonna happen is what's gonna happen.
I was raised to believe what is going to happen is what is going to happen.
I want to highlight yet again to the yet-to-be converted, that Wordpress is a compelling content management system (CMS) and not just a platform for blogs.

Does the computer do anything at all ie. say it needs drivers? 
The happiest man in the world? 
The 500 Greatest Albums of All Times
This must be the most unwanted console in recent memory.
In living history, the Etosha Pan has never been filled with water.

Mostly, what I would do if I owned a newspaper is make the newspaper a tool both for and of the populace of the community it served. 
If I were you I would take a break from him for a week so you both can figure things out.
I would not use this company again.
I wouldn't use this company again.

No sanction is imposed, except that Mr. Mangan shall comply with the orders contained herein.
Except for a small amount of effluent displaced during charging (usually 50 % of charge volume), there appears to be no sediments in the digester.
With the exception of Bernadine, I never felt anything for the women populating this film, as they failed to capture my interest or sympathy.
That said, there is a tendency to help the large industrial conglomerate more quickly than the small company you have never heard of.
That having been said, there is a forum where we discuss this sort of thing, and if you have questions you're more than welcome to post there.
That being said, you can be sure that our love story has only just begun
In spite of a sluggish housing market, there was employment growth in the industry last year.
Even so, we received thousands of submissions and we are very thankful for everyone who voluntarily submitted.
But at the same time, when it comes to private Good Samaritan undertakings that do alleviate poverty and despair, Americans are second to none.
But Wait A Minute! Isn't George W. Bush somehow not allowed to suggest folks "conserve gasoline" in the face of rising prices?
Well, hold on a second ... what's this about driving traffic to your website? 
Hold on a sec - I wouldn't say this means it is resolved. 
It's just that everyone's interest is stronger than mine.
That's all well and good... ...but if society is becoming more prosperous, why do young people seem less happy?
Here again is a reasonable point of view as far as it goes, but whether it goes as far and as deep as Chretien's is a matter for surmise.
You might think that empiricism is a worthy but rather dull and eclectic kind of philosophy.
You may think that empiricism is a worthy but rather dull and eclectic kind of philosophy.

It's a must.
Milk is a necessity
We can't do without that stuff.
Hun, you got to forget about that.
You had better get going.
Now you have to do it again.
It has to be stronger.
They need to try again.
It needs to be tighter.
They ought to reimburse you.
You better get going.

Then you can use the red lights.
Maybe we can try speech recognition.
You can't have the speech recognition without the speaker.
You cannot drop it from the design at this point.
We can't make changes now.
We cannot make changes now
You could use solar.
We could add a toaster functionality.
It's not able to change color.
There's no way that they'll increase the budget.
I don't see any way to do that.
There's just no way.

First and foremost, this is a deliberative body.
Now remember that your parents are watching.
Keep in mind that they have tried this before.
Don't forget that it has to be within the budget.
Let's not forget that we have only 15 euros to play with.
Let's keep in mind what Bill told us.
Let's remember to discuss the color choice again.
#13. rhetoricalquestion
So do we really need voice recognition?
Why not make it yellow?
Why don't you put it at the bottom?
What if the lcd has no color?
And who doesn't like a barking remote control?

But how did we get so benighted in the first place?
First of all, this election was definitely rigged. 

You don't want to annoy the boss.
People tell me I'll like it there... so much that I might not wanna come back. 

It'll be easy
That was a breeze
It was a real walk in the park
That was a piece of cake!
It's gonna be a snap.
It was a cinch.
That's child's play
That's difficult
It was a pain
He is a pain in the butt.
That thing is a bastard to drive.
It was no picnic, let me tell you.
That's gonna be tricky.
It was very arduous.
That'll be quite a challenge
It's very challenging.
We had a hard time understanding her.""")

In [28]:
contents = "so i see this like a you know a a topic i'm not convinced about"

In [2]:
# Naive split on full stops: a line that ends without a "." stays glued to the next line.
sentences = contents.split(".")
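
Splitting on `"."` alone merges example lines that have no full stop and ignores `!` and `?`. A minimal sketch of a slightly more careful splitter follows, assuming one example per line as in the test text; `split_sentences` and `sample` are names introduced here for illustration. It will still split wrongly on abbreviations such as "Mr." or "Ms.", so spaCy's own `doc.sents` may be the better option for real text.

```python
import re


def split_sentences(text):
    """Split on line breaks first, then on ., ! or ? followed by whitespace."""
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between example groups
        # The lookbehind keeps the punctuation attached to its sentence.
        for part in re.split(r"(?<=[.!?])\s+", line):
            if part:
                sentences.append(part)
    return sentences


sample = """
Milk is a necessity
We can't do without that stuff.
I bet that Holmes will have an accident too. I should go to lunch.
"""
for sentence in split_sentences(sample):
    print(sentence)
```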

In [3]:
for sentence in sentences:
    print(sentence)


    "Another strand of criticism aimed at GPT-3 and other LLMs is that the results they produce often tend to display toxicity and reproduce ethnic, racial, and other bias
 This really comes as no surprise, keeping in mind where the data used to train LLMs is coming from: the data is all generated by people, and to a large extent it has been collected from the web
 Unless corrective action is taken, its entirely expectable that LLMs will produce such output
Last but not least, LLMs take lots of resources to train and operate
 Chomskys aphorism about GPT-3 is that ""its only achievement is to use up a lot of Californias energy""
 But Chomsky is not alone in pointing this out
 In 2022, DeepMind published a paper, ""Training Compute-Optimal Large Language Models,"" in which analysts claim that training LLMs has been done with a deeply suboptimal use of compute
"



In [8]:
for sentence in sentences:
    try:
        print("******************")
        print(sentence)
        doc = nlp(sentence)
        argument_span = next(doc._.arguments.get_argument_spans())
        print("Argument lexicon:", argument_span.text)
        print("Label of lexicon:", argument_span.label_)
        print("Sentence where lexicon occurs:", argument_span.sent.text.strip())
    except StopIteration:
        # next() raises StopIteration when the parser yields no argument span for this sentence; catch it and move on to the next sentence.
        print("No argument found in sentence")
        continue

print("continuing here as proof that we've been able to break out of the loop without exiting the program.")
    

******************

    "Another strand of criticism aimed at GPT-3 and other LLMs is that the results they produce often tend to display toxicity and reproduce ethnic, racial, and other bias
No argument found in sentence
******************
 This really comes as no surprise, keeping in mind where the data used to train LLMs is coming from: the data is all generated by people, and to a large extent it has been collected from the web
Argument lexicon: really
Label of lexicon: contrast
Sentence where lexicon occurs: This really comes as no surprise, keeping in mind where the data used to train LLMs is coming from: the data is all generated by people, and to a large extent it has been collected from the web
******************
 Unless corrective action is taken, its entirely expectable that LLMs will produce such output
No argument found in sentence
******************
Last but not least, LLMs take lots of resources to train and operate
No argument found in sentence
******************
 Choms
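
The `try`/`except StopIteration` pattern above can also be written with the two-argument form of `next()`, which returns a default instead of raising. A small sketch, using a made-up generator (`fake_argument_spans`) as a stand-in for `doc._.arguments.get_argument_spans()` so it runs without spaCy:

```python
def first_or_none(iterable):
    """Return the first item of an iterable, or None if it is empty."""
    return next(iter(iterable), None)


def fake_argument_spans(sentence):
    # Hypothetical stand-in for doc._.arguments.get_argument_spans():
    # yields a (text, label) pair only when the sentence contains "really".
    if "really" in sentence:
        yield ("really", "contrast")


for sentence in ["This really comes as no surprise", "But Chomsky is not alone"]:
    span = first_or_none(fake_argument_spans(sentence))
    if span is None:
        print("No argument found in sentence")
    else:
        print("Argument lexicon:", span[0], "| Label of lexicon:", span[1])
```

Either way, only the first span per sentence is looked at; any further matches in the same sentence are ignored.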

In [9]:
# Count how often each lexicon label is the first match in a sentence.
# Note: when the example texts are loaded, some examples do not get the label they are listed under.
# For instance, the "doubt" examples are being labelled as causation; one of them starts with "so".
arg_dict = {
    "assessments":0,
    "doubt":0,
    "authority":0,
    "emphasis":0,
    "necessity":0,
    "causation":0,
    "generalization":0,
    "structure":0,
    "conditionals":0,
    "inconsistency":0,
    "possibility":0,
    "wants":0,
    "contrast":0,
    "priority":0,
    "difficulty":0,
    "inyourshoes":0,
    "rhetoricalquestion":0
}

for sentence in sentences:
    try:
        doc = nlp(sentence)
        argument_span = next(doc._.arguments.get_argument_spans())

        arg_dict[argument_span.label_] += 1
        
    except StopIteration:
        continue

for key, value in arg_dict.items():
    print(key, value)

assessments 0
doubt 0
authority 0
emphasis 0
necessity 0
causation 0
generalization 0
structure 0
conditionals 0
inconsistency 0
possibility 0
wants 0
contrast 1
priority 0
difficulty 0
inyourshoes 0
rhetoricalquestion 0
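
Because the cell above uses `next()`, it counts at most one label per sentence. A sketch of counting every label with `collections.Counter` is below; `labels_per_sentence` is hypothetical data standing in for what the parser would return, so the numbers mean nothing.

```python
from collections import Counter

# Hypothetical labels, one inner list per sentence; in the notebook these would
# come from [span.label_ for span in doc._.arguments.get_argument_spans()].
labels_per_sentence = [
    ["contrast"],
    [],
    ["causation", "doubt"],
    ["contrast", "emphasis"],
]

counts = Counter()
for labels in labels_per_sentence:
    counts.update(labels)  # counts every label, not just the first one

for label, count in counts.most_common():
    print(label, count)
```

A `Counter` also returns 0 for labels it has never seen, so the dictionary of zeros does not have to be written out by hand.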


## Roberta-Argument 

Trying a model that runs on PyTorch. May nuke my computer in the process.

In [50]:
import torch
from transformers import pipeline
import pandas as pd


In [44]:
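# "sentiment-analysis" is the transformers alias for text classification; the labels come from the model.
classifier = pipeline("sentiment-analysis", model="chkla/roberta-argument")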


In [47]:
roberta_contents = ("""
Well, our understanding was that we could buy that here.
In my opinion, that is not a good idea.
So, my take on this whole thing is as follows.
You know it seemed to me that we could do this faster.
It would seem to me that there is more to it than that.
It would appear to us that his heart's not in it.
It appears to me that fashion was the farthest thing from Jesus' mind.
My point is that you can't compare the capabilty here.
My main point is that you can't compare the two.
When I looked at EWS it looked to me as if it did only 2 things.
I get the impression that Robin is not a Mac enthusiast.
Atiku Abubakar, a second opposition candidate, said: "My impression is that this is a sham. It is no election."
In my book this is Sasha's best mix. It is quite simply outstanding.
And, to our mind, this is the only correct mode for working with CVS.
To our way of thinking, she is a quiet revolutionary.
As far as I am concerned there are some bloggers who just plain don't exist to me.
Pretty wonderful (if you ask me...)
On the other hand, my feeling is that robotically-added links should not automatically confer PageRank.
Not from where I am sitting, but does that mean anything?
RUMSFELD: Senator, I don't think that's true. I have never painted a rosy picture.
All I'm Saying is that I'm hiking exclusively in no-dogs-allowed parks for some time to come.
What we're saying is that there's a massive amount of disquiet about British foreign policy.
We're not saying that you don't need coal, but when you do mine the coal there are responsibilities to it. 
I guess what I'm trying to say is, if I can change, and you can change, everybody can change.
So what I mean is that they use the rhetoric of sexual liberation, empowerment. 

According to WHO recommendations, male circumcision should be followed by at least six weeks of abstinence.

If we want to use all the new features, we need to kill off legacy browsers that don't support it.
We need to kill off legacy browsers, if we want to use all the new features.
So, it would be realy nice if somebody could explain this to me! 
Wouldn't it be nice if we could just pass a law and fix America's schools? 
We want a bearded lady, an elephant, and if we could get midgets, that would be nice.
It won't work real well unless the parts are really distinct.
Unless you run the ball, any offense with poor QB play won't work real well. 
Actually, such a decomposition has to be done in order to support usability inspection methods and to identify and fix faults. 
In order to succeed there has to be development, says Ms. Papastavrou.
As long as we believe he should have found a "better" occupation, we will be reluctant to accept his choice of professions. 
We will be reluctant to accept his choice of professions, as long as we believe he should have found a "better" occupation, . 
We better hang out more or else I am dis-owning my bro.
It's pretty essential if most of your forums are post moderated, otherwise there'd be lots of double posting and confused people wondering where their posts.

uh as opposed to what you'd really want to know if you were gonna use this thing
um i think they should be recorded instead of written
so rather than say the most interesting thing something interesting
there's a median filtering and then there's a piece-wise linear fit based on some criteria
that's just a whole nother issue
there's no i mean the language model for switchboard is totally different
It's a whole new ballgame.
o_k so that's a that's a separate issue

so i see this like a you know a a topic i'm not convinced about
uh r- uh then i don't see how that would make it louder
i mean it's not clear that these musical noises hurt us in recognition
I doubt that the suns can advance.
We are a bit doubtful about that.

I have no doubt about it.
I say pretended because well, when you really think about it hating takes a lot of bitterness and resentment.
It is pretty obvious that the debasement of the human mind caused by a constant flow of fraudulent advertising is no trivial thing.
If you see the logo you'll know your digital music will play for sure.
I am confident that with good will on both sides that we can move beyond political statements and agree on a bill that gives our troops the funds 
If Google indexed only 100000 documents, would it be nearly as useful as it is today? Of course not.
There's no doubt about it, this is best retro compilation ever!
We say "without a doubt" to express that something is certain.
I bet that Holmes will have an accident too. I should go to lunch.
Kinky Friedman's candidacy is bound to be something; what that something is is still up for debate. 
If we really want to shift the reality in Iraq, Palestine and Lebanon and elsewhere, there's no two ways about it.
Sometimes there are no two ways about it. Some problems have only one solution.
Wood is a cantankerous substance; there's no two ways about it. 
One issue might be that some 32-bits ppc cpu might have more than one DABRs.
My feeling is that the sanctimonious one here might just be the same person who fails to recognize tongue-in-cheek when he sees it. 
Bees sense quantum fields, and that's why they are disappearing.
Bees sense quantum fields, and that is why they are disappearing
But seriously, the idea here is NOT to just stick the transmitter under your laces.
My whole point is that endings imply unity or wholeness in a duration of time that can only be achieved by some agency.
So I think what you have to do is to have the attitude that whatever you have that particular night you have to find a way to win. 
The reason is that Barton wants to use the British India story to argue that the pith of American conservation lies in imperial forestry.
Here's What the Billionaires Are Buying.
Here is what the couple thinks about war.
Paris is going to be staring in the remake of that old Linda Blair Movie.
I think in the end what will happen is that the end of crippled, RAM-sharing onboard video is near. 
I was raised to believe what's gonna happen is what's gonna happen.
I was raised to believe what is going to happen is what is going to happen.
I want to highlight yet again to the yet-to-be converted, that Wordpress is a compelling content management system (CMS) and not just a platform for blogs.

Does the computer do anything at all ie. say it needs drivers? 
The happiest man in the world? 
The 500 Greatest Albums of All Times
This must be the most unwanted console in recent memory.
In living history, the Etosha Pan has never been filled with water.

Mostly, what I would do if I owned a newspaper is make the newspaper a tool both for and of the populace of the community it served. 
If I were you I would take a break from him for a week so you both can figure things out.
I would not use this company again.
I wouldn't use this company again.

No sanction is imposed, except that Mr. Mangan shall comply with the orders contained herein.
Except for a small amount of effluent displaced during charging (usually 50 % of charge volume), there appears to be no sediments in the digester.
With the exception of Bernadine, I never felt anything for the women populating this film, as they failed to capture my interest or sympathy.
That said, there is a tendency to help the large industrial conglomerate more quickly than the small company you have never heard of.
That having been said, there is a forum where we discuss this sort of thing, and if you have questions you're more than welcome to post there.
That being said, you can be sure that our love story has only just begun
In spite of a sluggish housing market, there was employment growth in the industry last year.
Even so, we received thousands of submissions and we are very thankful for everyone who voluntarily submitted.
But at the same time, when it comes to private Good Samaritan undertakings that do alleviate poverty and despair, Americans are second to none.
But Wait A Minute! Isn't George W. Bush somehow not allowed to suggest folks "conserve gasoline" in the face of rising prices?
Well, hold on a second ... what's this about driving traffic to your website? 
Hold on a sec - I wouldn't say this means it is resolved. 
It's just that everyone's interest is stronger than mine.
That's all well and good... ...but if society is becoming more prosperous, why do young people seem less happy?
Here again is a reasonable point of view as far as it goes, but whether it goes as far and as deep as Chretien's is a matter for surmise.
You might think that empiricism is a worthy but rather dull and eclectic kind of philosophy.
You may think that empiricism is a worthy but rather dull and eclectic kind of philosophy.

It's a must.
Milk is a necessity
We can't do without that stuff.
Hun, you got to forget about that.
You had better get going.
Now you have to do it again.
It has to be stronger.
They need to try again.
It needs to be tighter.
They ought to reimburse you.
You better get going.

Then you can use the red lights.
Maybe we can try speech recognition.
You can't have the speech recognition without the speaker.
You cannot drop it from the design at this point.
We can't make changes now.
We cannot make changes now
You could use solar.
We could add a toaster functionality.
It's not able to change color.
There's no way that they'll increase the budget.
I don't see any way to do that.
There's just no way.

First and foremost, this is a deliberative body.
Now remember that your parents are watching.
Keep in mind that they have tried this before.
Don't forget that it has to be within the budget.
Let's not forget that we have only 15 euros to play with.
Let's keep in mind what Bill told us.
Let's remember to discuss the color choice again.
#13. rhetoricalquestion
So do we really need voice recognition?
Why not make it yellow?
Why don't you put it at the bottom?
What if the lcd has no color?
And who doesn't like a barking remote control?

But how did we get so benighted in the first place?
First of all, this election was definitely rigged. 

You don't want to annoy the boss.
People tell me I'll like it there... so much that I might not wanna come back. 

It'll be easy
That was a breeze
It was a real walk in the park
That was a piece of cake!
It's gonna be a snap.
It was a cinch.
That's child's play
That's difficult
It was a pain
He is a pain in the butt.
That thing is a bastard to drive.
It was no picnic, let me tell you.
That's gonna be tricky.
It was very arduous.
That'll be quite a challenge
It's very challenging.
We had a hard time understanding her.""")

In [48]:
sentences = roberta_contents.split(".")

In [68]:
# Classify each sentence, collect the results, then build the DataFrame in one go.
rows = []
for sentence in sentences:
    result = classifier(sentence)[0]  # top prediction: {'label': ..., 'score': ...}
    rows.append({'sentence': sentence, 'label': result['label'], 'score': result['score']})

df = pd.DataFrame(rows, columns=['sentence', 'label', 'score'])

In [73]:
pd.set_option('display.max_colwidth', None)
df.loc[df['label'] == 'ARGUMENT'].head(10)

| | sentence | label | score |
|---|---|---|---|
| 26 | \nWe're not saying that you don't need coal, but when you do mine the coal there are responsibilities to it | ARGUMENT | 0.769263 |
| 34 | \nUnless you run the ball, any offense with poor QB play won't work real well | ARGUMENT | 0.548608 |
| 47 | \nIt is pretty obvious that the debasement of the human mind caused by a constant flow of fraudulent advertising is no trivial thing | ARGUMENT | 0.865983 |
| 87 | \nBut at the same time, when it comes to private Good Samaritan undertakings that do alleviate poverty and despair, Americans are second to none | ARGUMENT | 0.664966 |
| 104 | \nMilk is a necessity\nWe can't do without that stuff | ARGUMENT | 0.649553 |
| 118 | \nWe cannot make changes now\nYou could use solar | ARGUMENT | 0.871053 |
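
The scores above range from about 0.55 to 0.87, so it may be worth keeping only the more confident predictions. A sketch of that filter follows, with a made-up `fake_classifier` standing in for the real pipeline so it runs without downloading the model. The `"NON-ARGUMENT"` label, the stub's scores, and the `classify_sentences` and `min_score` names are assumptions for illustration only.

```python
def classify_sentences(classifier, sentences, min_score=0.5):
    """Return rows for sentences labelled ARGUMENT with at least min_score."""
    rows = []
    for sentence in sentences:
        text = sentence.strip()
        if not text:
            continue  # skip empty fragments left over from splitting
        result = classifier(text)[0]
        if result["label"] == "ARGUMENT" and result["score"] >= min_score:
            rows.append({"sentence": text, "label": result["label"], "score": result["score"]})
    return rows


def fake_classifier(text):
    # Hypothetical stand-in for the transformers pipeline.
    # It mimics the return shape: a list with one label/score dict.
    if "obvious" in text:
        return [{"label": "ARGUMENT", "score": 0.87}]
    return [{"label": "NON-ARGUMENT", "score": 0.9}]


examples = ["\nIt is pretty obvious that this matters", "", "\nIt'll be easy"]
rows = classify_sentences(fake_classifier, examples, min_score=0.8)
print(rows)
```

Stripping each sentence before classifying also gets rid of the leading `\n` that shows up in the table.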
