# Natural Language Processing Using Hugging Face Transformer Models
#### NLP is a field of linguistics and machine learning focused on understanding human language. NLP tasks aim to understand not only individual words but also the context in which those words appear.

In [7]:
# Sentiment analysis
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I had a very rough day, but had a pretty satisfying evening")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9996447563171387}]
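
The warning above recommends naming the model and revision explicitly rather than relying on the default. A minimal sketch of doing so, reusing the model name and revision hash printed in that warning. `PINNED_SENTIMENT` and `build_pinned_classifier` are names made up for this example and are not part of the library:

```python
# Model name and revision are copied from the warning printed above.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_pinned_classifier():
    """Create the sentiment pipeline with an explicit model and revision."""
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    return pipeline(**PINNED_SENTIMENT)
```

Calling `build_pinned_classifier()` should give a classifier equivalent to the default one used above, as long as that revision remains available on the Hub.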

In [5]:
# Text generation
from transformers import pipeline

generator = pipeline("text-generation")
generator("today we will see")

No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'today we will see your team," said Mike T. Miller, the president of the Arizona State football team. "They don\'t even play here. So even if they wanted to stay, they got to get home in 10 or 20 minutes. They'}]
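
Generation with GPT-2 through this pipeline typically samples tokens, so rerunning the cell is likely to produce different text. The sketch below wraps the call so that the output length and the number of candidates are explicit. `generate_texts` is a hypothetical helper; `max_new_tokens`, `num_return_sequences` and `do_sample` are generation arguments that the text-generation pipeline forwards to the model, and the default values chosen here are arbitrary.

```python
def generate_texts(generator, prompt, max_new_tokens=30, num_return_sequences=2):
    """Return only the generated strings from a text-generation pipeline call."""
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # cap on newly generated tokens
        num_return_sequences=num_return_sequences,  # number of candidates
        do_sample=True,  # sampling is needed to get distinct candidates
    )
    # Each item is a dict with a 'generated_text' key, as in the output above.
    return [item["generated_text"] for item in outputs]
```

With the `generator` defined above, `generate_texts(generator, "today we will see")` would return a list of two strings. Calling `transformers.set_seed` beforehand is one way to make the sampling repeatable.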

In [8]:
# Text summarization
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
    Completed my Master of Computer Application degree at Christ (Deemed to be University), I have developed a strong foundation in computer science principles, data analysis,  programming, during my program I also gained some hands on experience on platforms like AWS and GCP . My Bachelor's degree in Computer Application from Acharya Bangalore Business School further enriched my technical skills and problem-solving abilities.
During my six-month internship as a Jr Data Engineer INTERN at Bosch Automotive Electronics India Pvt Ltd, I had the opportunity to work closely with a team of engineers and data analysts. This experience allowed me to gain valuable insights into real-world business requirements and translate them into technical specifications. I conducted exploratory data analysis (EDA) to identify patterns within large datasets, and I successfully deployed PySpark automation scripts on a distributed cluster environment. Additionally, I utilized PySpark Data Frame operations for data preprocessing and transformation, and connected Impala tables directly to the cluster for efficient search results. I actively participated in performance testing and optimization efforts to enhance data processing workflows and documented project processes and workflows for knowledge transfer. Throughout this internship, I demonstrated proficiency in Agile software development processes and utilized tools such as Jira for defect tracking and Bitbucket for version control.
One of my notable achievements during the internship was my involvement in the AE MSD-DAP (data analytics platform)-Automation of IRP project. I developed an automation script using PySpark, which resolved data pipeline issues and contributed to improving the overall efficiency of the project.
In addition to my technical skills, I possess excellent communication skills and a strong aptitude for collaboration. I actively participate in team meetings, brainstorming sessions, and knowledge-sharing activities, ensuring effective communication and a positive team environment.
Beyond my academic and professional experiences, I have also been involved in notable projects such as the development of the Christ Bot, a chatbot that clarifies user queries on behalf of the Computer Science department at Christ (Deemed to be University). I was also part of the team that created Nakshatra, an event management system for large-scale events. These projects have further enhanced my problem-solving, project management, and teamwork abilities.
Some of my accomplishments include pursuing an internship at Robert Bosch Automotive Electronics Pvt. Ltd as a Data Engineer, publishing an entrepreneur titled 'Nakshatra' at the ETLTC International Japan Mini-Conference, and obtaining certifications in VB.NET, Robotics (workshop), and Abacus. I also served as a Committee Head in the Finance Management team for a National-level Gateways Computer Science College fest in 2022. Furthermore, I have showcased my talents by winning the State-level Karate fighting championship and fashion shows in various inter-college fests.
I have attached my resume for your review, which provides further details about my education, experience, and technical skills. I am confident that my strong educational background, technical expertise, and passion for Big Data and Python development make me a suitable candidate for the data engineer  role.
Thank you for considering my application. I look forward to the opportunity of discussing how my skills and enthusiasm can contribute to the success of the organization. Please feel free to contact me via phone or email at your convenience.

"""
)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': ' I completed my Master of Computer Application degree at Christ (Deemed to be University), I have developed a strong foundation in computer science principles, data analysis, programming, and programming . During my six-month internship as a Jr Data Engineer INTERN at Bosch Automotive Electronics India Pvt Ltd, I had the opportunity to work closely with a team of engineers and data analysts .'}]

In [9]:
# Fill-mask: predict the most likely words for the <mask> token
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This File will teach you all about <mask> models.", top_k=2)

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'score': 0.13053028285503387,
  'token': 30412,
  'token_str': ' mathematical',
  'sequence': 'This File will teach you all about mathematical models.'},
 {'score': 0.02500234730541706,
  'token': 38163,
  'token_str': ' computational',
  'sequence': 'This File will teach you all about computational models.'}]
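
The pipeline returns one dictionary per candidate word. A small sketch of post-processing that list follows. The `predictions` literal is copied from the output above, and `best_fill` is a hypothetical helper introduced only for this example:

```python
# Candidates copied from the fill-mask output above.
predictions = [
    {"score": 0.13053028285503387, "token": 30412, "token_str": " mathematical",
     "sequence": "This File will teach you all about mathematical models."},
    {"score": 0.02500234730541706, "token": 38163, "token_str": " computational",
     "sequence": "This File will teach you all about computational models."},
]


def best_fill(candidates):
    """Return the (word, score) pair of the highest-scoring candidate."""
    top = max(candidates, key=lambda c: c["score"])
    # token_str carries a leading space, as seen in the output above.
    return top["token_str"].strip(), top["score"]


word, score = best_fill(predictions)
print(f"{word}: {score:.1%}")
```

Note that even the top candidate has a score of only about 0.13, so the model is far from certain about this blank.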

In [10]:
# Language translation (French to English)
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

[{'translation_text': 'This course is produced by Hugging Face.'}]

In [11]:
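# Named entity recognition
from transformers import pipeline

# grouped_entities=True merges sub-word tokens that belong to the same entity
ner = pipeline("ner", grouped_entities=True)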
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'entity_group': 'PER',
  'score': 0.9981694,
  'word': 'Sylvain',
  'start': 11,
  'end': 18},
 {'entity_group': 'ORG',
  'score': 0.9796019,
  'word': 'Hugging Face',
  'start': 33,
  'end': 45},
 {'entity_group': 'LOC',
  'score': 0.9932106,
  'word': 'Brooklyn',
  'start': 49,
  'end': 57}]
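
The `start` and `end` values are character offsets into the input string, so each entity can be recovered by slicing. A sketch that checks this against the output above and groups the entity text by type. `entities_by_type` is a hypothetical helper, and the scores are omitted for brevity:

```python
from collections import defaultdict

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the NER output above, without the scores.
entities = [
    {"entity_group": "PER", "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "word": "Brooklyn", "start": 49, "end": 57},
]


def entities_by_type(text, entities):
    """Map each entity type to the substrings it covers in the text."""
    grouped = defaultdict(list)
    for ent in entities:
        # Slice the original text using the character offsets.
        grouped[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


print(entities_by_type(text, entities))
```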

In [12]:
# Question answering
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.694976270198822, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}
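
The result includes a confidence score, about 0.69 here. One option, sketched below, is to discard answers that fall under a cutoff. The `min_score` value of 0.5 and the helper name `answer_or_none` are assumptions made for illustration and are not library defaults:

```python
# Result copied from the question-answering output above.
result = {"score": 0.694976270198822, "start": 33, "end": 45, "answer": "Hugging Face"}


def answer_or_none(result, min_score=0.5):
    """Return the answer text, or None when the score is below the cutoff."""
    return result["answer"] if result["score"] >= min_score else None


print(answer_or_none(result))
```

A suitable cutoff depends on the model and the data, so it is best chosen by inspecting scores on known examples.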