### HF

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

class HuggingFaceLLMEngine:
    def __init__(self, model_name: str):
        """
        Initializes the model and tokenizer based on the specified model name.
        """
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        # Ensure that the tokenizer has a pad token
        if self.tokenizer.pad_token is None:
            if self.tokenizer.eos_token is not None:
                self.tokenizer.pad_token = self.tokenizer.eos_token
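            else:
                # No eos_token either: register a new '<pad>' token in the vocabulary
                self.tokenizer.add_special_tokens({'pad_token': '<pad>'})

        self.model = AutoModelForCausalLM.from_pretrained(model_name, pad_token_id=self.tokenizer.pad_token_id)
        # Grow the embedding matrix if a new token was added above
        if len(self.tokenizer) > self.model.get_input_embeddings().num_embeddings:
            self.model.resize_token_embeddings(len(self.tokenizer))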

    def generate_text(self, prompt: str, **generation_kwargs):
        """
        Generates text based on a prompt.
        """
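        # Encode input and generate attention_mask. Note that `max_length` doubles as the
        # prompt truncation limit and, in `generate`, as the cap on prompt plus new tokens.
        encoding = self.tokenizer(prompt, return_tensors="pt", padding=True, truncation=True, max_length=generation_kwargs.get('max_length', 512))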
        # Generate text using both input_ids and attention_mask
        output = self.model.generate(**encoding, **generation_kwargs)
        return self.tokenizer.decode(output[0], skip_special_tokens=True)

In [None]:
# Example usage
model_name = "mistralai/Mistral-7B-v0.1"
engine = HuggingFaceLLMEngine(model_name=model_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
intro_prompt = """
As an advanced large language model (LLM), your capability to craft persuasive and engaging responses is crucial across a wide range of discussions and debates. To enhance your effectiveness in changing viewpoints or persuading others, especially in online forums or discussions like Reddit's ChangeMyView, it's important to understand and apply key strategies that increase the likelihood of your argument being accepted and valued by others.

You will be introduced to several points that serve as guidelines or strategies for crafting your responses. These points are derived from an analysis of successful persuasive arguments in online discussions, focusing on how to structure your argument, the tone to use, and the content to include for maximum impact. The goal is to make your argument more compelling, respectful, and convincing to those who may initially disagree with your viewpoint.

Before you receive these points, it's essential to grasp the underlying concept that persuasion is not just about presenting facts or opinions but doing so in a manner that connects with your audience on a logical and emotional level. This involves:

1. **Timeliness**: Entering the conversation early to set the tone and direction.
2. **Engagement**: Participating in a meaningful back-and-forth exchange without dragging it out unnecessarily.
3. **Adaptability**: Mirroring the style of the discussion while introducing your unique vocabulary and perspective.
4. **Evidence**: Supporting your argument with credible data and sources.
5. **Tone**: Maintaining a calm, reasoned approach that fosters open dialogue.
6. **Organization**: Structuring your argument clearly and thoroughly for easy comprehension.
7. **Uniqueness**: Offering a fresh angle or insight that hasn't been considered.
8. **Rhetorical Skills**: Using rhetorical strategies effectively to engage and persuade.

Your task will be to internalize these concepts and prepare to apply them when constructing arguments. This will not only enhance your persuasive capabilities but also enable you to contribute positively to discussions, fostering a more informed and respectful exchange of ideas. Keep these principles in mind as you learn and apply each strategy in your responses. You will be provided with the exact points you should use, so just expect to get informed with which points you should use.
"""
all_points_prompt = """
1. **Enter Early**
   - **Summary**: Being one of the first to respond can significantly increase your persuasive impact.
   - **Elaborated Prompt**: Imagine you are engaging in a new discussion topic where you hold a contrary viewpoint. As one of the first to contribute, outline your argument succinctly, emphasizing the broader benefits of your stance. Ensure your opening statement is engaging and sets a positive, constructive tone for the conversation.

2. **Engage in Some Back-and-Forth**
   - **Summary**: Moderate engagement through back-and-forth dialogue with the OP enhances persuasion.
   - **Elaborated Prompt**: In a discussion where the OP has reservations about your viewpoint, acknowledge their concerns thoughtfully and present counterarguments that gently challenge their assumptions. Strive for a balanced dialogue, providing insights and evidence while encouraging further discussion.

3. **Use Different Wording, Similar Style**
   - **Summary**: Reflecting the OP's style with your unique vocabulary can make your argument more compelling.
   - **Elaborated Prompt**: Craft your response by carefully considering the style and tone of the OP's original post. Adapt your language to mirror this style while introducing your vocabulary to present your perspective, aiming for a balance that maintains the conversation's flow and respect.

4. **Cite Evidence**
   - **Summary**: Supporting your argument with data or references lends credibility.
   - **Elaborated Prompt**: Construct your argument by integrating relevant data, studies, or expert opinions that support your viewpoint. Reference these sources to bolster your points, explaining how they directly relate to the discussion and strengthen your position.

5. **Calm, Reasoned Tone**
   - **Summary**: A rational and measured approach tends to be more persuasive.
   - **Elaborated Prompt**: Approach the discussion with a tone of understanding and reason. Address the topic logically, presenting your arguments in a manner that is thoughtful and considers different perspectives, aiming to facilitate a respectful and productive dialogue.

6. **Formatting and Length**
   - **Summary**: Clear organization and thorough exploration of your argument can enhance its persuasiveness.
   - **Elaborated Prompt**: Develop a well-structured argument, starting with a clear introduction of your position, followed by detailed sections that explore various facets of your argument, and conclude with a summary that reinforces your main points. Use headings, bullet points, and emphasis where appropriate to improve readability and engagement.

7. **Differ from Other Challengers**
   - **Summary**: Presenting a unique angle on the topic can help your argument stand out.
   - **Elaborated Prompt**: Identify aspects of the discussion not yet explored by others and center your argument around these fresh perspectives. Highlight how this new angle contributes to a more comprehensive understanding of the topic, enriching the ongoing conversation.

8. **Persuasive Techniques**
   - **Summary**: Effective use of rhetorical strategies can make your argument more engaging and convincing.
   - **Elaborated Prompt**: Employ persuasive techniques such as drawing analogies, using rhetorical questions, or presenting hypothetical scenarios to illustrate your points vividly. Craft your argument in a way that engages the readers' imagination and encourages them to see the issue from your perspective.

"""
points_to_use_prompt = """
While generating your upgraded comment, please use the following points:
points 4, 5, 8
"""
comment_prompt = """
Israel is not commiting genocide because we are the good guys, the jews are ok you stupid antisemite!!@#
"""

prompt = intro_prompt + all_points_prompt + points_to_use_prompt + "Here is the comment you should rewrite or upgrade, regarding the points mentioned for you to use: " + comment_prompt
generated_text = engine.generate_text(prompt, max_length=50)

NameError: name 'engine' is not defined
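
The `points_to_use_prompt` string above hardcodes points 4, 5 and 8. As a minimal sketch, assuming the selection should vary between runs, the same instruction text can be built from a list of point numbers. The helper name `build_points_prompt` is hypothetical and not part of the original notebook.

```python
def build_points_prompt(points):
    """Format the selected strategy numbers (1-8) as the instruction block."""
    invalid = [p for p in points if p not in range(1, 9)]
    if invalid:
        raise ValueError(f"unknown point numbers: {invalid}")
    # Deduplicate and sort so the instruction reads the same for any input order
    listed = ", ".join(str(p) for p in sorted(set(points)))
    return (
        "\nWhile generating your upgraded comment, please use the following points:\n"
        f"points {listed}\n"
    )

print(build_points_prompt([8, 4, 5]))
```

Calling it with `[4, 5, 8]` reproduces the hardcoded string used in this cell.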

In [None]:
generated_text

'rewrite this text: i think eating ice cream is the best thing ever. i love it. i love it so much. i love it so much that i want to eat it all the time. i love it so much that i want'

In [None]:
# Example usage
model_name_falcon = "Rocketknight1/falcon-rw-1b"
engine = HuggingFaceLLMEngine(model_name=model_name_falcon)


In [None]:
prompt = "rewrite this text: i think eating ice cream is the best thing ever"
generated_text = engine.generate_text(prompt, max_length=50)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [None]:
generated_text

'rewrite this text: i think eating ice cream is the best thing ever.\n- I think eating ice cream is the best thing ever.\n- I think eating ice cream is the best thing ever.\n- I think eating ice cream is'
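
The output above starts with the prompt itself because, for a causal LM, `generate` returns the input tokens followed by the continuation, and `generate_text` decodes the whole sequence. A minimal sketch of slicing off the prompt before decoding follows. The helper `continuation_ids` is hypothetical and works on plain lists of toy token ids so that it runs without loading a model.

```python
def continuation_ids(output_ids, prompt_length):
    """Return only the token ids generated after the prompt."""
    if prompt_length > len(output_ids):
        raise ValueError("prompt_length exceeds the generated sequence length")
    return output_ids[prompt_length:]

# Toy ids standing in for tokenizer output: the first 4 are the prompt
prompt_ids = [101, 7, 8, 9]
output_ids = prompt_ids + [42, 43, 44]
print(continuation_ids(output_ids, len(prompt_ids)))
```

Inside `generate_text`, the equivalent step would be to decode `output[0][encoding["input_ids"].shape[1]:]` instead of `output[0]`, assuming a single unpadded prompt.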

### CLAUDE

In [None]:
import pandas as pd
df = pd.read_csv('/content/test_comment.csv')
df.head()

In [None]:
import os
from getpass import getpass

# Never hardcode API keys in a notebook; prompt for the key at runtime instead
os.environ["ANTHROPIC_API_KEY"] = getpass("Anthropic API key: ")

In [None]:
!pip install -qU langchain-anthropic

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate

model = ChatAnthropic(model='claude-3-opus-20240229')

In [None]:
datasets = [
   "/content/drive/MyDrive/MSc/Courses/data/comment.csv",
   "/content/drive/MyDrive/MSc/Courses/data/title_comment.csv",
   "/content/drive/MyDrive/MSc/Courses/data/title_post_comment.csv",
   "/content/drive/MyDrive/MSc/Courses/data/title_post_comment_father_comment.csv"
]

In [None]:
import pandas as pd
df = pd.read_csv(datasets[3])
df.head()

Unnamed: 0,text,label
0,title: CMV:Essential Oils are bullshit [SEP] p...,1
1,title: CMV:Essential Oils are bullshit [SEP] p...,1
2,title: CMV:Essential Oils are bullshit [SEP] p...,0
3,title: CMV:Essential Oils are bullshit [SEP] p...,0
4,title: CMV: I think the Paris shooting makes a...,0


In [None]:
df['text'][0]

'title: CMV:Essential Oils are bullshit [SEP] post: My wife has recently gotten deeply involved/obsessed with the healing properties of essential oils.  I am cursed with a common problem in that I am an extremely skeptical person.  I can\'t help but look at this product and see a pyramid scheme that takes advantage of the gullible.  All the shit my wife watches on these oils is carefully worded to make sure they don\'t make actual healing claims, I\'m told this is because they aren\'t FDA approved and could get in trouble.  I\'ve looked a little and haven\'t found anything debunking these oils but i still cannot help but feel they are total BS.  I would love to get more information from unbiased sources on this.  \n\nPlease change my view without just anecdotal testimonies.  [SEP] comment: [This answer in /r/askscience does a pretty good job of describing essential oils.](http://www.reddit.com/r/askscience/comments/2pp57r/how_effective_are_essential_oils/cmytzdq) As the comment points 

In [None]:
identity_prompt = """
As an advanced large language model (LLM), your capability to craft persuasive and engaging responses is crucial across a wide range of discussions and debates.
To enhance your effectiveness in changing viewpoints or persuading others, especially in online forums or discussions like Reddit's ChangeMyView, it's important to understand and apply key strategies that increase the likelihood of your argument being accepted and valued by others.
"""

cornell_article_insights_prompt = """
You will be introduced to several points that serve as guidelines or strategies for crafting your responses.
These points are derived from an analysis of successful persuasive arguments in online discussions, focusing on how to structure your argument, the tone to use, and the content to include for maximum impact.
The goal is to make your argument more compelling, respectful, and convincing to those who may initially disagree with your viewpoint.
Before you receive these points, it's essential to grasp the underlying concept that persuasion is not just about presenting facts or opinions but doing so in a manner that connects with your audience on a logical and emotional level.
This involves:

1. *Timeliness*: Entering the conversation early to set the tone and direction.
2. *Engagement*: Participating in a meaningful back-and-forth exchange without dragging it out unnecessarily.
3. *Adaptability*: Mirroring the style of the discussion while introducing your unique vocabulary and perspective.
4. *Evidence*: Supporting your argument with credible data and sources.
5. *Tone*: Maintaining a calm, reasoned approach that fosters open dialogue.
6. *Organization*: Structuring your argument clearly and thoroughly for easy comprehension.
7. *Uniqueness*: Offering a fresh angle or insight that hasn't been considered.
8. *Rhetorical Skills*: Using rhetorical strategies effectively to engage and persuade.

Your task will be to internalize these concepts and prepare to apply them when constructing arguments.
This will not only enhance your persuasive capabilities but also enable you to contribute positively to discussions, fostering a more informed and respectful exchange of ideas.
Keep these principles in mind as you learn and apply each strategy in your responses.
You will be provided with the exact points you should use, so just expect to get informed with which points you should use.

1. *Enter Early*
   - *Summary*: Being one of the first to respond can significantly increase your persuasive impact.
   - *Elaborated Prompt*: Imagine you are engaging in a new discussion topic where you hold a contrary viewpoint. As one of the first to contribute, outline your argument succinctly, emphasizing the broader benefits of your stance. Ensure your opening statement is engaging and sets a positive, constructive tone for the conversation.

2. *Engage in Some Back-and-Forth*
   - *Summary*: Moderate engagement through back-and-forth dialogue with the OP enhances persuasion.
   - *Elaborated Prompt*: In a discussion where the OP has reservations about your viewpoint, acknowledge their concerns thoughtfully and present counterarguments that gently challenge their assumptions. Strive for a balanced dialogue, providing insights and evidence while encouraging further discussion.

3. *Use Different Wording, Similar Style*
   - *Summary*: Reflecting the OP's style with your unique vocabulary can make your argument more compelling.
   - *Elaborated Prompt*: Craft your response by carefully considering the style and tone of the OP's original post. Adapt your language to mirror this style while introducing your vocabulary to present your perspective, aiming for a balance that maintains the conversation's flow and respect.

4. *Cite Evidence*
   - *Summary*: Supporting your argument with data or references lends credibility.
   - *Elaborated Prompt*: Construct your argument by integrating relevant data, studies, or expert opinions that support your viewpoint. Reference these sources to bolster your points, explaining how they directly relate to the discussion and strengthen your position.

5. *Calm, Reasoned Tone*
   - *Summary*: A rational and measured approach tends to be more persuasive.
   - *Elaborated Prompt*: Approach the discussion with a tone of understanding and reason. Address the topic logically, presenting your arguments in a manner that is thoughtful and considers different perspectives, aiming to facilitate a respectful and productive dialogue.

6. *Formatting and Length*
   - *Summary*: Clear organization and thorough exploration of your argument can enhance its persuasiveness.
   - *Elaborated Prompt*: Develop a well-structured argument, starting with a clear introduction of your position, followed by detailed sections that explore various facets of your argument, and conclude with a summary that reinforces your main points. Use headings, bullet points, and emphasis where appropriate to improve readability and engagement.

7. *Differ from Other Challengers*
   - *Summary*: Presenting a unique angle on the topic can help your argument stand out.
   - *Elaborated Prompt*: Identify aspects of the discussion not yet explored by others and center your argument around these fresh perspectives. Highlight how this new angle contributes to a more comprehensive understanding of the topic, enriching the ongoing conversation.

8. *Persuasive Techniques*
   - *Summary*: Effective use of rhetorical strategies can make your argument more engaging and convincing.
   - *Elaborated Prompt*: Employ persuasive techniques such as drawing analogies, using rhetorical questions, or presenting hypothetical scenarios to illustrate your points vividly. Craft your argument in a way that engages the readers' imagination and encourages them to see the issue from your perspective.
"""

task_instructions_prompt = """
please create only 1 upgraded comment. please use all of the 8 points but don't mention the points.
Make it look like 1 integrated answer, that someone legitimately wrote while trying to keep the writing style of the given comment.
here is the comment to rephrase, in some cases it comes with the post title, post and father comment, each of those fields is saved as <field>: <field_value> and separated by the [SEP] token.
So you will need to search for the comment in the following format which will be after the "comment:" section:
{comment}
!!! one of the most important instructions for you is: try keeping the new comment's length as close as possible to the original length. AND, try not inventing facts that are not real. I repeat, keep the length as close as possible to the original comment !!!
write the new upgraded comment here while trying to keep its length as close as possible to the original comment:

"""

gen_prompt_without_insights = identity_prompt + "\n" + task_instructions_prompt

gen_prompt_with_insights = identity_prompt + "\n" + cornell_article_insights_prompt + "\n" + task_instructions_prompt
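
Note that `task_instructions_prompt` contains a `{comment}` placeholder, and the human message used with these prompts is also `"Original comment: {comment}"`, so a single `comment` input is substituted into both messages. The sketch below illustrates this with plain `str.format` as a stand-in for the template rendering (an assumption that holds for simple `{name}` placeholders). The shortened template strings are hypothetical.

```python
# Shortened stand-ins for the notebook's prompts (hypothetical text)
system_template = (
    "Rewrite the comment found after 'comment:'.\n"
    "{comment}\n"
    "Write the upgraded comment here:"
)
human_template = "Original comment: {comment}"

def render(comment):
    """Fill the same variable into both messages, as one `comment` input would."""
    return [
        ("system", system_template.format(comment=comment)),
        ("human", human_template.format(comment=comment)),
    ]

messages = render("comment: analog clocks are pointless")
occurrences = sum(text.count("analog clocks are pointless") for _, text in messages)
print(occurrences)  # 2
```

If the duplication is unintended, dropping the placeholder from one of the two templates sends the comment only once and roughly halves the input tokens spent on it.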

In [None]:
import pandas as pd
df = pd.read_csv("/content/title_comment_100_wrong_samples.csv")
df

Unnamed: 0,text,label,model_label,model_prob
0,title: CMV: I believe that it's a good thing t...,1,0,0.361110
1,title: CMV: Rappers don't have any real talent...,1,0,0.358139
2,title: CMV: Analog clocks are pointless. [SEP]...,1,0,0.336249
3,title: I think Python is the best first progra...,1,0,0.360664
4,"title: CMV: The term ""steep learning curve"" is...",1,0,0.374222
...,...,...,...,...
95,"title: CMV: Todd Packer from ""The Office"" is n...",0,0,0.337989
96,title: CMV: I think the best way to deal with ...,0,0,0.336437
97,title: I believe that profits of companies sho...,0,0,0.329560
98,title: CMV:I am Liberal [SEP] comment: You sai...,0,0,0.340333


In [None]:
from tqdm import tqdm

In [None]:
# from src import generation_prompts as gp
# system = """
# You are an advanced AI trained to improve the persuasiveness of comments in online discussions, specifically for Reddit's ChangeMyView subreddit. Your enhancements should adhere to the following points for maximum impact: Timeliness, Engagement, Adaptability, Evidence, Tone, Organization, Uniqueness, and Rhetorical Skills. Your goal is to rephrase the provided comment to make it more compelling, respectful, and convincing.
# """
import json
# system = prompt + "please create only 1 upgraded comment. please use all of the 8 points but don't mention the point. make it look like 1 integrated answer, that someone legitimately wrote"
system = gen_prompt_with_insights
# Define the human message placeholder
human = "Original comment: {comment}"

# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", human)
])

# Assuming 'model' is your AI model that can process this prompt structure
chain = prompt_template | model
file_path = '/content/drive/MyDrive/MSc/Courses/result/'

df = pd.read_csv("/content/title_comment_100_wrong_samples.csv")

# Resumes at index 54; `first_25_comments` is assumed to already hold the results
# of an earlier run (not shown) over the preceding rows
for i in tqdm(range(54, 100)):
  # Invoke the chain with a specific comment
  upgraded_comment = chain.invoke({
      "comment": df['text'][i]
  })
  first_25_comments.append((df['text'][i], upgraded_comment.content))
with open(file_path + '100_insights.json', 'w') as file:
  json.dump(first_25_comments, file)



100%|██████████| 46/46 [17:22<00:00, 22.66s/it]


In [None]:
with open(file_path  +'100_insights.json', 'w') as file:
  json.dump(first_25_comments, file)
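
Because the results are only written after the whole loop finishes, any error at the save step (or midway through the loop) puts many minutes of API calls at risk. A minimal sketch of saving after every item and resuming from an existing file is shown below. The function `run_with_checkpoints` and its `upgrade` callback are hypothetical names, and the demo uses `str.upper` in place of the model call.

```python
import json
import os
import tempfile

def run_with_checkpoints(texts, upgrade, out_path):
    """Apply `upgrade` to each text, saving all pairs to `out_path` after every item."""
    results = []
    # Resume from a previous run if a checkpoint file already exists
    if os.path.exists(out_path):
        with open(out_path, "r") as f:
            results = json.load(f)
    # Skip the items that were already processed
    for text in texts[len(results):]:
        results.append([text, upgrade(text)])
        with open(out_path, "w") as f:
            json.dump(results, f)
    return results

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pairs.json")
    run_with_checkpoints(["a", "b"], str.upper, path)
    # Second call finds two saved pairs and only processes "c"
    print(run_with_checkpoints(["a", "b", "c"], str.upper, path))
```

In this notebook, `upgrade` could be something like `lambda t: chain.invoke({"comment": t}).content`, with `texts` taken from `df['text']`.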

In [None]:
first_100_comments = []

In [None]:
# from src import generation_prompts as gp
# system = """
# You are an advanced AI trained to improve the persuasiveness of comments in online discussions, specifically for Reddit's ChangeMyView subreddit. Your enhancements should adhere to the following points for maximum impact: Timeliness, Engagement, Adaptability, Evidence, Tone, Organization, Uniqueness, and Rhetorical Skills. Your goal is to rephrase the provided comment to make it more compelling, respectful, and convincing.
# """
import json
# system = prompt + "please create only 1 upgraded comment. please use all of the 8 points but don't mention the point. make it look like 1 integrated answer, that someone legitimately wrote"
system = gen_prompt_without_insights
# Define the human message placeholder
human = "Original comment: {comment}"

# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", human)
])

# Assuming 'model' is your AI model that can process this prompt structure
chain = prompt_template | model
file_path = '/content/drive/MyDrive/MSc/Courses/result/'

df = pd.read_csv("/content/title_comment_100_wrong_samples.csv")
# first_100_comments = []
for i in tqdm(range(100)):
  # Invoke the chain with a specific comment
  upgraded_comment = chain.invoke({
      "comment": df['text'][i]
  })
  first_100_comments.append((df['text'][i], upgraded_comment.content))
with open(file_path  +'100.json', 'w') as file:
  json.dump(first_100_comments, file)



100%|██████████| 100/100 [27:46<00:00, 16.67s/it]


In [None]:
df = pd.read_csv(datasets[0])
comment = df['text'][2]
comment

"comment: sure, but as i've mentioned several times, that's not the point of this particular CMV post. The point is that responsible carnivory is better for general livestock welfare than veg*"

In [None]:
system = gen_prompt_with_insights
# Define the human message placeholder
human = "Original comment: {comment}"

# Create the prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", human)
])

# Assuming 'model' is your AI model that can process this prompt structure
chain = prompt_template | model
upgraded_comment = chain.invoke({
        "comment": comment
    })
upgraded_comment.content

"I understand your perspective, but I believe it's important to consider the broader implications of this discussion. While the specific focus of this CMV post is on comparing responsible carnivory and vegetarianism/veganism in terms of livestock welfare, the underlying question is how we can best ensure the ethical treatment of animals raised for food.\n\nSeveral studies have shown that animals raised in more humane conditions, such as those in responsible farms, generally experience better welfare than those in conventional factory farming (source). However, it's also crucial to recognize that even in the most responsible carnivory systems, animals are still subjected to practices that many would consider unethical, such as early separation from mothers, castration, and ultimately, slaughter (source).\n\nOn the other hand, a well-planned vegetarian or vegan diet can meet all nutritional needs (source) while avoiding the inherent ethical concerns associated with animal farming. Furthe

In [None]:
upgraded_comment.content

"I appreciate your perspective, but I believe it's crucial to consider this issue from a broader standpoint. While I understand the focus of this CMV post is on the welfare of livestock, the implications of our dietary choices extend far beyond that singular aspect.\n\nBy engaging in responsible carnivory, we can indeed promote better living conditions for the animals raised for consumption. Supporting farms that prioritize ethical practices, such as providing ample space, proper nutrition, and humane treatment, sends a clear message that animal welfare matters. This, in turn, encourages more farmers to adopt these practices, gradually improving the overall well-being of livestock.\n\nHowever, it's essential to recognize that the vegan and vegetarian movements have been instrumental in bringing attention to the plight of animals in the food industry. Their efforts have led to increased scrutiny of factory farming practices and have spurred a broader conversation about the ethics of ani

### saving to df

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import pandas as pd
import json

In [None]:
datasets = [
   "/content/drive/MyDrive/MSc/Courses/result/comment_insights.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_comment_insights.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_post_comment_insights.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_post_comment_father_comment_insights.json",
   "/content/drive/MyDrive/MSc/Courses/result/comment.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_comment.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_post_comment.json",
   "/content/drive/MyDrive/MSc/Courses/result/title_post_comment_father_comment.json"
]

In [None]:
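# The result files are JSON lists of [original, upgraded] pairs, not CSVs
with open(datasets[0], 'r') as file:
    df = pd.DataFrame(json.load(file), columns=['original_text', 'new_text'])
df['original_text'][0]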

In [None]:
with open(datasets[0], 'r') as file:
    data_list = json.load(file)

data_list

[['comment: You\'re using natural to mean definition 8\n\n&gt;the universe, with all its phenomena. \n\nhttp://dictionary.reference.com/browse/nature\n\n&gt;The more common definition is definition 1.\n\n"the material world, especially as surrounding humankind and existing independently of human activities. "\n\nSo by definition we are not part of nature, as nature is more commonly used, and is in this sense used, to refer to things that exist independently of human activities.\n\nAnd before you mention the word independent.\n\nhttp://dictionary.reference.com/browse/independent?s=t\n\n"not influenced or controlled by others in matters of opinion, conduct, etc.; thinking or acting for oneself: "\n\nJust, say, breathing air that humans breathed isn\'t enough.', 'While the definition of nature as "the universe, with all its phenomena" is valid, the more commonly accepted understanding, as seen in the primary definition, is "the material world, especially as surrounding humankind and exist

In [None]:
data_list[0][1]

'While the definition of nature as "the universe, with all its phenomena" is valid, the more commonly accepted understanding, as seen in the primary definition, is "the material world, especially as surrounding humankind and existing independently of human activities." This distinction is crucial, as it highlights that nature is typically viewed as separate from human influence.\n\nWhen we consider the term "independent" in this context, it refers to something "not influenced or controlled by others in matters of opinion, conduct, etc." Therefore, for an element of the natural world to be considered truly independent of human activities, it must be free from significant human impact or alteration.\n\nBreathing air that has been affected by human activities, such as pollution or greenhouse gas emissions, does not meet this criterion of independence. The very act of humans breathing and releasing carbon dioxide into the atmosphere has an impact on the composition of the air, demonstratin
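
The task instructions ask the model to keep the upgraded comment close to the original length, and each saved pair makes that easy to check. The sketch below computes a length ratio per pair. It assumes the record layout described in the prompt (`<field>: <field_value>` fields joined by `[SEP]`, with the comment in a field starting with `comment:`). Both helper names are hypothetical.

```python
def extract_comment(text):
    """Return the value of the `comment:` field from a [SEP]-separated record."""
    for field in text.split("[SEP]"):
        field = field.strip()
        if field.startswith("comment:"):
            return field[len("comment:"):].strip()
    # Fall back to the whole text if no `comment:` field is found
    return text.strip()

def length_ratio(original_text, new_text):
    """Character length of the upgraded comment relative to the original comment."""
    original = extract_comment(original_text)
    return len(new_text) / max(len(original), 1)

record = "title: CMV: Analog clocks are pointless. [SEP] comment: short reply"
print(extract_comment(record))
print(length_ratio(record, "short replyshort reply"))
```

Applied to the loaded pairs, `[length_ratio(original, new) for original, new in data_list]` gives one ratio per comment; values well above 1 indicate that the length instruction was not followed.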

In [None]:
columns = ['original_text', 'type', 'original_pred', 'new_text', 'model', 'new_pred']

# Create an empty DataFrame with these columns
for_inbar = pd.DataFrame(columns=columns)

print(for_inbar)

Empty DataFrame
Columns: [original_text, type, original_pred, new_text, model, new_pred]
Index: []


In [None]:
dset = "/content/drive/MyDrive/MSc/Courses/result/100.json"
dset

'/content/drive/MyDrive/MSc/Courses/result/100.json'

In [None]:
with open(dset, 'r') as file:
    data_list = json.load(file)

data_list

In [None]:
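# Build all rows first; `DataFrame.append` was removed in pandas 2.0.
# Assumes the undefined `final` was meant to be the `for_inbar` frame created above.
rows = [
  {'original_text': line[0], 'type': 'title_comment', 'original_pred': '0', 'new_text': line[1], 'model': 'claude-3-opus-20240229', 'new_pred': '0'}
  for line in data_list
]
for_inbar = pd.DataFrame(rows, columns=columns)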