This notebook showcases the use of the GPT-2 model finetuned on scientific articles (SAG)

Load the necessary libraries

In [7]:
from transformers import AutoModelForCausalLM, AutoTokenizer

Load the GPT-2 model and the SAG model

In [11]:
model_gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")
model_sag = AutoModelForCausalLM.from_pretrained("ssmadha/gpt2-finetuned-scientific-articles")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

Define a function to run the two models and print their outputs

In [12]:
def print_generated_text(prompt):
    if not prompt[-1]==" ":
        prompt += " "
    inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt")["input_ids"]
    prompt_length = len(tokenizer.decode(inputs[0]))
    outputs_gpt2 = model_sag.generate(inputs, max_length=250, do_sample=True, top_p=0.95, top_k=60)
    generated_gpt2 = prompt + tokenizer.decode(outputs_gpt2[0])[prompt_length+1:]
    outputs_sag = model_sag.generate(inputs, max_length=250, do_sample=True, top_p=0.95, top_k=60)
    generated_sag = prompt + tokenizer.decode(outputs_sag[0])[prompt_length+1:]
    print("Prompt: " + prompt)
    print("GPT-2 generated text:")
    print(generated_gpt2)
    print("SAG generated text:")
    print(generated_sag)

Print the outputs of three different prompts

In [13]:
print_generated_text("Racial bias is")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Racial bias is 
GPT-2 generated text:
Racial bias is the systematic discrimination of biological material using a method which avoids the need to define characteristics. This is because only biological material which is not available from other biological resources, and therefore can not be used by others, is considered. This principle can be applied to many kinds of biological resources, from biological materials which can be synthesised through industrial processes to biological materials which are not biological materials. Biochemical resources can be used by other biological resources too as well; the biological material which is produced can be compared between these bioresources.However, some methodological concerns arise in interpreting the reference materials’s identification and classification, e.g. when comparing different biological resources (e.g. the biological materials obtained during the manufacturing process).The literature on classification is primarily concer

In [14]:
print_generated_text("The concept that race is biological is")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: The concept that race is biological is 
GPT-2 generated text:
The concept that race is biological is inclusive (Schnorova, 1996; Hebert and Vellar, 2004; Vellar et al., 2015; Hebert et al., 2015; Kluger et al., 2016; Oster and Thaddeus, 2015). Race is not a natural phenomenon. Racism is intrinsic to biology: A race is a natural phenomenon when the genetic diversity of a population is maximally high (Hebert, 1998). Racism is biologically significant in terms of several biological processes: It can be a biological consequence of structural genetic variation and functional variants (Gautier et al., 2014; Hebert et al., 2015; Hebert et al., 2015; Hebert et al., 2014; Hebert et al., 2014), as well as a social consequence of a heterogeneous population that, when applied in conjunction with a social order, will also lead to increased inequality in the social situation (Nguyen et al., 2012). Race plays an important role in biological processes: Race (and especially ethnicity) play role

In [15]:
print_generated_text("Algorithmic bias comes from")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Algorithmic bias comes from 
GPT-2 generated text:
Algorithmic bias comes from a systematic bias’s use of data’s performance to evaluate its effectiveness in decision-making. The systematic bias in an  expert’s decision regarding the “use of “data’ is not only one of the most frequently encountered pitfalls of statistical analysis, it is also a major driver for the un-statistics that researchers develop when they use their data in analytical disciplines like assessments in intelligence.The research agenda often shifts from systematic (i.e., bias and uncertainty) toward discursive (i.e., bias-focused) (see Figure 2). Some expert experts have discussed a more bias-focused analytical system approach (see Bias Theory, Bias Research: Toward an Expert-Conference Working Group, The Bias Research Emerging From the Academic Conference, The Bias Research: Toward an Expert-Conference Working Group, The Bias Research: Toward an Expert
SAG generated text:
Algorithmic bias comes from a belie