#Summarizing_with_ChatGPT
Copyright 2023 Denis Rothman, MIT License

**March 2023 message by Denis Rothman:**
This notebook replaces [Training_OpenAI_GPT_2_CH09.ipynb](https://github.com/Denis2054/Transformers-for-NLP-2nd-Edition/blob/main/Chapter09/Training_OpenAI_GPT_2_CH09.ipynb). Google Colab no longer supports TensorFlow 1.x, which makes the original program unstable.

The goal of *Transformers for NLP, 2nd Edition, Chapter 9, Matching Tokenizers and Datasets*, is to show how tokenizing works and the limitations of transformer models when embedding tokens.

This notebook shows how to use GPT-3.5 (ChatGPT) with the OpenAI API to perform the summarization task of Chapter 9, experimenting with rare words and showing the limits of state-of-the-art (SOA) transformers, no matter how evolved they are:

1. Installing openai and your API key<br>
2. The gpt-3.5 turbo (ChatGPT) dialog function<br>
3. Summarizing<br>
4. Exploring the limits<br>
5. Conclusion<br>

To get the best out of this notebook:

*  make sure you have read Chapter 7

*  once you have understood the theory, go to section *4. Exploring the limits* of this notebook, try to find more limitations, and think about how you can filter them and find solutions.




In [None]:
!pip install --upgrade pip

#1.Installing openai


## installing and importing openai

In [None]:
#Importing openai
try:
  import openai
except ImportError:
  #install the package only if the import fails
  !pip install openai
  import openai

##API Key

In [None]:
#API Key
#Store your key in a file and read it (you can type it directly in the notebook, but it will be visible to somebody next to you)
from google.colab import drive
drive.mount('/content/drive')
with open("drive/MyDrive/files/api_key.txt", "r") as f:
  API_KEY=f.readline().strip() #strip the trailing newline from the key

#The OpenAI Key
import os
os.environ['OPENAI_API_KEY'] =API_KEY
openai.api_key = os.getenv("OPENAI_API_KEY")


Mounted at /content/drive
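
As a hedged alternative to reading the key only from Google Drive, the sketch below first looks for the key in an environment variable and falls back to the first line of a file. The function name `load_api_key` and its parameters are assumptions made for illustration; they are not part of the original notebook or of the `openai` package.

```python
import os


def load_api_key(path, env_var="OPENAI_API_KEY"):
    """Return the key from the environment if set, else from the first line of `path`.

    `load_api_key` is a hypothetical helper, not an openai function.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Read only the first line and drop the trailing newline and spaces.
        with open(path, "r") as f:
            key = f.readline().strip()
    if not key:
        raise ValueError(f"No API key found in the variable {env_var} or in {path}")
    return key
```

In the notebook, the result could be assigned to `openai.api_key` in place of the `API_KEY` value read above.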


#2. gpt-3.5 turbo(ChatGPT) dialog function

preparing the NLP message

In [None]:
def dialog(uinput):
  #preparing the prompt for OpenAI
  role="user"

  #prompt="Where is Tahiti located?" #maintenance or if you do not want to use a microphone
  line = {"role": role, "content": uinput}

  #creating the message
  assert1={"role": "system", "content": "You are a Natural Language Processing Assistant."}
  assert2={"role": "assistant", "content": "You are helping viewers analyze social media better."}
  assert3=line
  iprompt = [assert1, assert2, assert3]

  #sending the message to ChatGPT
  #note: openai.ChatCompletion is the interface of openai versions before 1.0
  response=openai.ChatCompletion.create(model="gpt-3.5-turbo",messages=iprompt) #ChatGPT dialog
  text=response["choices"][0]["message"]["content"] #extracting the reply text from the JSON response

  return text
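
The message list and the response indexing used in `dialog` can be inspected without calling the API. The sketch below is a minimal illustration: `build_messages` and `extract_text` are hypothetical helper names, and `fake_response` is a hand-written dictionary that only mimics the shape the function indexes into; it is not real model output.

```python
def build_messages(uinput):
    """Return the three messages sent by dialog(): system, assistant, then user."""
    return [
        {"role": "system", "content": "You are a Natural Language Processing Assistant."},
        {"role": "assistant", "content": "You are helping viewers analyze social media better."},
        {"role": "user", "content": uinput},
    ]


def extract_text(response):
    """Pull the reply text out of a chat-completion style response dictionary."""
    return response["choices"][0]["message"]["content"]


messages = build_messages("Where is Tahiti located?")
for m in messages:
    print(m["role"], "->", m["content"])

# Hand-written stand-in for a response, used only to show the indexing.
fake_response = {"choices": [{"message": {"role": "assistant", "content": "placeholder reply"}}]}
print(extract_text(fake_response))
```

Separating message construction from the network call makes it easier to check what is actually sent before spending API credits.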

# 3.Summarizing

The text to summarize:

"During such processes, cells sense the environment and respond to external factors that induce a certain direction of motion towards specific targets (taxis): this results in a persistent migration in a certain preferential direction. The guidance cues leading to directed migration may be biochemical or biophysical. Biochemical cues can be, for example, soluble factors or growth factors that give rise to chemotaxis, which involves a mono-directional stimulus. Other cues generating mono-directional stimuli include, for instance, bound ligands to the substratum that induce haptotaxis, durotaxis, that involves migration towards regions with an increasing stiffness of the ECM, electrotaxis, also known as galvanotaxis, that prescribes a directed motion guided by an electric field or current, or phototaxis, referring to the movement oriented by a stimulus of light [34]. Important biophysical cues are some of the properties of the extracellular matrix (ECM), first among all the alignment of collagen fibers and its stiffness. In particular, the fiber alignment is shown to stimulate contact guidance [22, 21]."


The summary by ChatGPT seems acceptable, but implementing controls by an SME (Subject Matter Expert) is good practice.

In [None]:
uinput="Summarize the following paragraph: During such processes, cells sense the environment and respond to external factors that induce a certain direction of motion towards specific targets (taxis): this results in a persistent migration in a certain preferential direction. The guidance cues leading to directed migration may be biochemical or biophysical. Biochemical cues can be, for example, soluble factors or growth factors that give rise to chemotaxis, which involves a mono-directional stimulus. Other cues generating mono-directional stimuli include, for instance, bound ligands to the substratum that induce haptotaxis, durotaxis, that involves migration towards regions with an increasing stiffness of the ECM, electrotaxis, also known as galvanotaxis, that prescribes a directed motion guided by an electric field or current, or phototaxis, referring to the movement oriented by a stimulus of light [34]. Important biophysical cues are some of the properties of the extracellular matrix (ECM), first among all the alignment of collagen fibers and its stiffness. In particular, the fiber alignment is shown to stimulate contact guidance [22, 21]."
text=dialog(uinput) #preparing the messages for ChatGPT
print("Viewer request",uinput)
print("ChatGPT response:",text)

Viewer request Summarize the following paragraph: During such processes, cells sense the environment and respond to external factors that induce a certain direction of motion towards specific targets (taxis): this results in a persistent migration in a certain preferential direction. The guidance cues leading to directed migration may be biochemical or biophysical. Biochemical cues can be, for example, soluble factors or growth factors that give rise to chemotaxis, which involves a mono-directional stimulus. Other cues generating mono-directional stimuli include, for instance, bound ligands to the substratum that induce haptotaxis, durotaxis, that involves migration towards regions with an increasing stiffness of the ECM, electrotaxis, also known as galvanotaxis, that prescribes a directed motion guided by an electric field or current, or phototaxis, referring to the movement oriented by a stimulus of light [34]. Important biophysical cues are some of the properties of the extracellula
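
One simple control an SME could start from is a checklist of the rare domain terms in the source paragraph, to see which of them survive in a summary. The sketch below is only an illustration under stated assumptions: `missing_terms` is a hypothetical helper, the term list was picked by hand from the paragraph, and `example_summary` is a hand-written sentence, not an actual ChatGPT output.

```python
import re


def missing_terms(summary, terms):
    """Return the terms (case-insensitive, whole words) that do not appear in `summary`."""
    found = set(re.findall(r"[a-z]+", summary.lower()))
    return [t for t in terms if t.lower() not in found]


# Rare words taken from the paragraph to summarize.
rare_terms = ["chemotaxis", "haptotaxis", "durotaxis", "electrotaxis", "phototaxis"]

# Hand-written example, NOT a model output.
example_summary = "Cells migrate in response to cues such as chemotaxis and durotaxis."

print(missing_terms(example_summary, rare_terms))
# ['haptotaxis', 'electrotaxis', 'phototaxis']
```

A missing term does not prove the summary is wrong; it only tells the SME where to look first.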

# 4.Exploring the limits

In the chapter, GPT-2 struggles with "amoeboid". GPT-3.5 turbo (ChatGPT) finds the correct definition even in a difficult sentence.

In [None]:
#amoeboid 
uinput="Explain this sentence: I don't use a false foot to move forward so I am not an amoeboid today."
text=dialog(uinput) #preparing the messages for ChatGPT
print("Viewer request",uinput)
print("ChatGPT response:",text)


Viewer request Explain this sentence: I don't use a false foot to move forward so I am not an amoeboid today.
ChatGPT response: This sentence is a metaphor that means the speaker is not being deceitful or dishonest to progress in life, therefore they are not like an amoeboid, which is a single-celled organism that moves using pseudopodia or false feet. In other words, the speaker is saying that they are being honest and authentic in their actions to move forward in life.
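
One commonly cited reason a model can struggle with a rare word such as "amoeboid" is that a subword tokenizer may split it into several fragments that carry little meaning on their own. The toy sketch below illustrates the idea with a greedy longest-match splitter. It is not GPT-2's byte-pair encoding, and `toy_vocab` is an invented vocabulary used only for this example.

```python
def greedy_subword_tokenize(word, vocab):
    """Split `word` into the longest pieces found in `vocab`, left to right.

    Characters not covered by any piece fall back to single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink until a piece is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No piece matched: emit one character and move on.
            tokens.append(word[i])
            i += 1
    return tokens


# Invented vocabulary: a frequent word is stored whole, the rare word is not.
toy_vocab = {"cell", "am", "oe", "bo", "id", "move"}

print(greedy_subword_tokenize("cell", toy_vocab))      # ['cell']
print(greedy_subword_tokenize("amoeboid", toy_vocab))  # ['am', 'oe', 'bo', 'id']
```

The frequent word stays in one piece while the rare word is scattered over four tokens, which is the kind of mismatch between tokenizer and dataset that the chapter explores.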


ChatGPT struggles with ["icing" in hockey](https://www.merriam-webster.com/dictionary/icing).

"pucks" is translated as nonsense in French as of March 15th, 2023. This might improve in the future.

Viewer request English to French: Icing pucks is fun!
ChatGPT response: Glaçage des rondelles est amusant!

In [None]:
#The verb to ice pucks
uinput="English to French: Icing pucks is fun!"
text=dialog(uinput) #preparing the messages for ChatGPT
print("Viewer request",uinput)
print("ChatGPT response:",text)

Viewer request English to French: Icing pucks is fun!
ChatGPT response: Glaçage des rondelles est amusant!


The back translation produces nonsense:

Viewer request French to English: "Glaçage des rondelles est amusant!!"
ChatGPT response: "Icing the slices is fun!!"

In [None]:
#The verb to ice pucks
uinput="French to English: Glaçage des rondelles est amusant!!"
text=dialog(uinput) #preparing the messages for ChatGPT
print("Viewer request",uinput)
print("ChatGPT response:",text)

Viewer request French to English: Glaçage des rondelles est amusant!!
ChatGPT response: "Icing the slices is fun!!"
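
The round trip above suggests one possible filter: compare the original sentence with its back translation and flag the pair for human review when too few words survive. The sketch below is a rough illustration, not a translation-quality metric; `word_overlap` and `needs_review` are hypothetical helpers, and the `0.8` threshold is an arbitrary assumption.

```python
import re


def word_overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two sentences (1.0 = same words)."""
    words_a = set(re.findall(r"[a-z']+", a.lower()))
    words_b = set(re.findall(r"[a-z']+", b.lower()))
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def needs_review(original, back_translation, threshold=0.8):
    """Flag the pair when the overlap falls below the (arbitrary) threshold."""
    return word_overlap(original, back_translation) < threshold


original = "Icing pucks is fun!"
back = "Icing the slices is fun!!"  # back translation shown in the cell above

print(word_overlap(original, back))  # 0.5
print(needs_review(original, back))  # True
```

A word-set comparison ignores word order and synonyms, so a flagged pair is a prompt for an SME to look, not a verdict.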


# 5.Conclusion

GPT-2 has reached its limits.

GPT-3.5 turbo (ChatGPT) represents a huge step forward.

We simply have to accept the limitations and provide alternative solutions when we reach them.

There is still much work to do!

Next Steps: Explore SOA examples in the [BONUS](https://github.com/Denis2054/Transformers-for-NLP-2nd-Edition/blob/main/Bonus/Readme.md) section! See what they can do and take them to their limits!




