## This notebook demonstrates commonly used LangChain Loaders and Splitters

#### In LangChain, a Document is a simple structure with two fields:
- `page_content (string)`: This field contains the raw text of the document.
- `metadata (dictionary)`: This field stores additional metadata about the text, such as the source URL, author, or any other relevant information.
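
To make that shape concrete, here is a minimal stand-in built with a standard-library dataclass. `SimpleDocument` is a hypothetical name used only for illustration; it is not LangChain's own `Document` class, it only mirrors the two fields listed above.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in that mirrors the two fields described above."""
    page_content: str                             # raw text of the document
    metadata: dict = field(default_factory=dict)  # e.g. source URL, author


doc = SimpleDocument(
    page_content="Lorem ipsum dolor sit amet.",
    metadata={"source": "sample.txt"},
)
print(doc.page_content)
print(doc.metadata["source"])
```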

In [2]:
from langchain_community.document_loaders import TextLoader
 
# Load text data from a file using TextLoader
loader = TextLoader("sample.txt")
document = loader.load()
print(document)

[Document(metadata={'source': 'sample.txt'}, page_content='Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse nibh leo, malesuada vel dictum suscipit, placerat sed turpis. Suspendisse quis enim nisl. Suspendisse id placerat ante, sed dictum tellus. Suspendisse nec blandit sapien. Maecenas pharetra semper ante, et tristique est sagittis nec. Proin a arcu vulputate, bibendum leo ac, vulputate ipsum. Pellentesque faucibus velit a tellus consequat, in eleifend felis blandit.')]


In [3]:
document[0].page_content

'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse nibh leo, malesuada vel dictum suscipit, placerat sed turpis. Suspendisse quis enim nisl. Suspendisse id placerat ante, sed dictum tellus. Suspendisse nec blandit sapien. Maecenas pharetra semper ante, et tristique est sagittis nec. Proin a arcu vulputate, bibendum leo ac, vulputate ipsum. Pellentesque faucibus velit a tellus consequat, in eleifend felis blandit.'

In [4]:
document[0].metadata

{'source': 'sample.txt'}

### Types of Document Loaders in LangChain

#### LangChain offers three main types of Document Loaders:

- `Transform Loaders`: These loaders handle different input formats and transform them into the Document format. For instance, consider a CSV file named "data.csv" with columns for "name" and "age". Using the CSVLoader, you can load the CSV data into Documents.
- `Public Dataset or Service Loaders`: LangChain provides loaders for popular public sources, allowing quick retrieval and creation of Documents. For example, the WikipediaLoader can load content from Wikipedia.
- `Proprietary Dataset or Service Loaders`: These loaders are designed to handle proprietary sources that may require additional authentication or setup. For instance, a loader could be created specifically for loading data from an internal database or an API with proprietary access.
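
The CSV case can be sketched with the standard library alone. This is a simplified illustration of the idea of one Document per row, not the `CSVLoader` implementation; the helper name `rows_to_documents` and the two sample rows are made up for the example.

```python
import csv
import io

# Stand-in for the contents of a "data.csv" file with "name" and "age" columns (made-up rows).
csv_text = "name,age\nAlice,30\nBob,25\n"


def rows_to_documents(handle, source):
    """Turn each CSV row into a dict holding page_content and metadata."""
    docs = []
    for i, row in enumerate(csv.DictReader(handle)):
        # Render the row as "column: value" lines, one per column.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        docs.append({"page_content": content,
                     "metadata": {"source": source, "row": i}})
    return docs


docs = rows_to_documents(io.StringIO(csv_text), source="data.csv")
for d in docs:
    print(d["page_content"])
    print("------")
```

Each row becomes its own unit of text, so a downstream splitter or retriever can treat records independently.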

### Transform Loader example

In [8]:
# CSVLoader

from langchain_community.document_loaders import CSVLoader
 
# Load data from a CSV file using CSVLoader
loader = CSVLoader("HR-Employee-Attrition.csv")
documents = loader.load()
 
# Access the content and metadata of each document
for document in documents:
    content = document.page_content
    metadata = document.metadata
 
    # Process the content and metadata
    print(content)
    print("------")

Age: 41
Attrition: Yes
BusinessTravel: Travel_Rarely
DailyRate: 1102
Department: Sales
DistanceFromHome: 1
Education: 2
EducationField: Life Sciences
EmployeeCount: 1
EmployeeNumber: 1
EnvironmentSatisfaction: 2
Gender: Female
HourlyRate: 94
JobInvolvement: 3
JobLevel: 2
JobRole: Sales Executive
JobSatisfaction: 4
MaritalStatus: Single
MonthlyIncome: 5993
MonthlyRate: 19479
NumCompaniesWorked: 8
Over18: Y
OverTime: Yes
PercentSalaryHike: 11
PerformanceRating: 3
RelationshipSatisfaction: 1
StandardHours: 80
StockOptionLevel: 0
TotalWorkingYears: 8
TrainingTimesLastYear: 0
WorkLifeBalance: 1
YearsAtCompany: 6
YearsInCurrentRole: 4
YearsSinceLastPromotion: 0
YearsWithCurrManager: 5
------
Age: 49
Attrition: No
BusinessTravel: Travel_Frequently
DailyRate: 279
Department: Research & Development
DistanceFromHome: 8
Education: 1
EducationField: Life Sciences
EmployeeCount: 1
EmployeeNumber: 2
EnvironmentSatisfaction: 3
Gender: Male
HourlyRate: 61
JobInvolvement: 2
JobLevel: 2
JobRole: Resear

------
Age: 50
Attrition: No
BusinessTravel: Travel_Rarely
DailyRate: 1126
Department: Research & Development
DistanceFromHome: 1
Education: 2
EducationField: Medical
EmployeeCount: 1
EmployeeNumber: 997
EnvironmentSatisfaction: 4
Gender: Male
HourlyRate: 66
JobInvolvement: 3
JobLevel: 4
JobRole: Research Director
JobSatisfaction: 4
MaritalStatus: Divorced
MonthlyIncome: 17399
MonthlyRate: 6615
NumCompaniesWorked: 9
Over18: Y
OverTime: No
PercentSalaryHike: 22
PerformanceRating: 4
RelationshipSatisfaction: 3
StandardHours: 80
StockOptionLevel: 1
TotalWorkingYears: 32
TrainingTimesLastYear: 1
WorkLifeBalance: 2
YearsAtCompany: 5
YearsInCurrentRole: 4
YearsSinceLastPromotion: 1
YearsWithCurrManager: 3
------
Age: 33
Attrition: No
BusinessTravel: Travel_Frequently
DailyRate: 827
Department: Research & Development
DistanceFromHome: 1
Education: 4
EducationField: Other
EmployeeCount: 1
EmployeeNumber: 998
EnvironmentSatisfaction: 3
Gender: Female
HourlyRate: 84
JobInvolvement: 4
JobLevel:

------
Age: 26
Attrition: No
BusinessTravel: Travel_Rarely
DailyRate: 1167
Department: Sales
DistanceFromHome: 5
Education: 3
EducationField: Other
EmployeeCount: 1
EmployeeNumber: 2060
EnvironmentSatisfaction: 4
Gender: Female
HourlyRate: 30
JobInvolvement: 2
JobLevel: 1
JobRole: Sales Representative
JobSatisfaction: 3
MaritalStatus: Single
MonthlyIncome: 2966
MonthlyRate: 21378
NumCompaniesWorked: 0
Over18: Y
OverTime: No
PercentSalaryHike: 18
PerformanceRating: 3
RelationshipSatisfaction: 4
StandardHours: 80
StockOptionLevel: 0
TotalWorkingYears: 5
TrainingTimesLastYear: 2
WorkLifeBalance: 3
YearsAtCompany: 4
YearsInCurrentRole: 2
YearsSinceLastPromotion: 0
YearsWithCurrManager: 0
------
Age: 36
Attrition: No
BusinessTravel: Travel_Frequently
DailyRate: 884
Department: Research & Development
DistanceFromHome: 23
Education: 2
EducationField: Medical
EmployeeCount: 1
EmployeeNumber: 2061
EnvironmentSatisfaction: 3
Gender: Male
HourlyRate: 41
JobInvolvement: 4
JobLevel: 2
JobRole: La

### PDFLoader
`PyPDFLoader` loads each page of the PDF as a separate Document.

In [35]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("Software-Engineer-CV.pdf")
pages = loader.load()

In [42]:
# Number the pages from 1 while printing each page's text
for cnt, page in enumerate(pages, start=1):
    print("---- Document #", cnt)
    print(page.page_content.strip())


---- Document # 1
Name: Sunil Sharma                              Mobile: +91 9898989898  
 
Designation: Senior Technical Lead                      Mail Id: sunil.sharma @gmail.com  
 
Objective:   
Experienced S enior Software Developer with 1 2 years of hands -on expertise in 
designing, developing, and delivering high -quality software solutions.  
Proven track record of successfully leading and collaborating with cross -functional 
teams to deliver projects on time and within budget. Seeking to leverage my technical 
skills and leadership experience to contribute to innovative software projects.  
Education:  
Bachelor in Engineering in Electronics and Communication  
K.L.N.  College of Information Technology, Madurai - 2007  
Professional Summary:  
• 12 years  of experience in Software Development in C on  Linux Environment . 
• Over 5 years of programming  experience as an Oracle PL/SQL  developer in 
Analysis, Design and Implementation of business application using Oracle DBMS

### WebBaseLoader
`WebBaseLoader` loads all the text from an HTML web page into Document objects that can be used downstream.
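
As a rough illustration of what "all text from an HTML page" means, the sketch below pulls text nodes out of a small HTML string with the standard-library `html.parser`. It is not how `WebBaseLoader` is implemented; the class `TextExtractor`, the helper `html_to_text`, and the sample markup are assumptions made for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample_html = (
    "<html><head><title>Demo</title><style>p{color:red}</style></head>"
    "<body><p>Hello</p>\n\n<p>World</p></body></html>"
)
page_text = html_to_text(sample_html)
print(repr(page_text))
```

Text from adjacent elements is concatenated with whatever whitespace the markup contained, which is why raw page text usually needs the cleanup steps shown later.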

In [13]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://blog.jetbov.com/2024/05/27/inovacao-na-pecuaria-transformando-a-gestao-atraves-da-tecnologia-da-informacao/")
data = loader.load()

In [18]:
data[0].page_content

'\n\n\n\n\n\n\n\n\n\nInovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação - Pecuária do Futuro - O Blog da JetBov!\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nAlternar navegaçãoMenu\n\nPular para o conteúdo\n\nInício\nQuem Somos\nMateriais de apoio ao pecuarista\nCursos EAD\n\n\n\nBusca\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nInício2024maio27Inovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nInovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação\n\nRonaldo Ribeiro\n27/05/2024\nComente!\nTecnologia\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nVocê verá nesse artigo:Desafios enfrentados pelos gestores na pecuáriaAvanço da tecnologia na gestão da pecuária de co

In [19]:
# Basic cleanup: strip leading/trailing whitespace, then replace each pair of newlines with one
formatted_text = data[0].page_content.strip().replace("\n\n", "\n")  # a single pass only halves long runs of newlines

print(formatted_text)

Inovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação - Pecuária do Futuro - O Blog da JetBov!






































 















Alternar navegaçãoMenu
Pular para o conteúdo
Início
Quem Somos
Materiais de apoio ao pecuarista
Cursos EAD

Busca







Início2024maio27Inovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação

















Inovação na Pecuária: Transformando a Gestão através da Tecnologia da Informação
Ronaldo Ribeiro
27/05/2024
Comente!
Tecnologia















Você verá nesse artigo:Desafios enfrentados pelos gestores na pecuáriaAvanço da tecnologia na gestão da pecuária de corteQuer saber mais sobre esse tema?
Assista nosso webinar: Inteligência na gestão da pecuária de corteO papel da JetBov na transformação da gestão da pecuária
Tempo de leitura: 6 minutos A pecuária de corte está passando por uma revolução, na qual a gestão estratégica dentro da propriedade é essencial para garantir a eficiênc

In [58]:
# Use a regular expression for more thorough cleaning
import re

# Collapse every run of whitespace (spaces, tabs, newlines) into a single space.
# Because \s also matches newlines, the result is one long line, so a separate
# newline-limiting pass after this would have nothing left to match.
cleaned_text = re.sub(r"\s+", " ", formatted_text)

print(cleaned_text)

IBM - India Building trust with Responsible AI Being responsible means being trustworthy. Can you truly rely on your AI solution? Learn Responsible AI Meet watsonx.governance See how watsonx is enhancing the fan experience at this year’s GRAMMYs Hybrid cloud can help unlock the power of GenAI Recommended for you Move data of any size across any distance Create, manage, secure, and socialize your APIs Manage and protect your mobile workforce Save 10% on SPSS Statistics subscription Browse our technology From our flagship products for enterprise hybrid cloud infrastructure to next-generation AI, security and storage solutions, find the answer to your business challenge. View all products Shop special offers and discounts AI & machine learning Use IBM Watsonx’s AI or build your own machine learning models Analytics Aggregate and analyze large datasets Compute & servers Run workloads on hybrid cloud infrastructure Databases Store, query and analyze structured data DevOps Manage infrastruct
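
If paragraph breaks should survive the cleanup, horizontal whitespace and newlines can be handled separately. The following is one possible sketch, not part of the original notebook; `clean_text` and the sample string are hypothetical.

```python
import re


def clean_text(raw):
    """Collapse spaces/tabs and cap consecutive newlines at two."""
    text = re.sub(r"[ \t\r\f\v]+", " ", raw)   # collapse horizontal whitespace only
    text = re.sub(r" ?\n ?", "\n", text)       # drop spaces hugging a newline
    text = re.sub(r"\n{3,}", "\n\n", text)     # keep at most one blank line
    return text.strip()


sample = "Title\n\n\n\n  \nFirst   paragraph\tline.\n\n\n\nSecond paragraph."
cleaned = clean_text(sample)
print(cleaned)
```

Keeping the double newlines matters later: the splitters below use `"\n\n"` as their first separator, so flattened text gives them nothing to split on except spaces.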

### JSON Loader

In [66]:
#!pip install jq

In [91]:
from langchain_community.document_loaders import JSONLoader

import json
from pathlib import Path
from pprint import pprint

file_path='sample.json'
data = json.loads(Path(file_path).read_text())

In [92]:
pprint(data)

{'employees': [{'email': 'shyamjaiswal@gmail.com', 'name': 'Shyam'},
               {'email': 'bob32@gmail.com', 'name': 'Bob'},
               {'email': 'jai87@gmail.com', 'name': 'Jai'}]}


In [97]:
loader = JSONLoader(
    file_path="sample.json",
    jq_schema=".employees[].email",  # jq expression: the email of every employee
    text_content=False)

data = loader.load()

In [98]:
data

[Document(page_content='shyamjaiswal@gmail.com', metadata={'source': '/Users/Manas/FreshersProjects/GenAI-Learning/Self-Learning/sample.json', 'seq_num': 1}),
 Document(page_content='bob32@gmail.com', metadata={'source': '/Users/Manas/FreshersProjects/GenAI-Learning/Self-Learning/sample.json', 'seq_num': 2}),
 Document(page_content='jai87@gmail.com', metadata={'source': '/Users/Manas/FreshersProjects/GenAI-Learning/Self-Learning/sample.json', 'seq_num': 3})]
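
For readers unfamiliar with jq, the expression `.employees[].email` can be mirrored in plain Python. This sketch only illustrates what the expression selects; it does not use `JSONLoader` or the `jq` package, and the JSON is inlined from the `pprint` output above so the snippet runs on its own.

```python
import json

# Same shape as the sample.json content printed above.
raw = (
    '{"employees": ['
    '{"name": "Shyam", "email": "shyamjaiswal@gmail.com"}, '
    '{"name": "Bob", "email": "bob32@gmail.com"}, '
    '{"name": "Jai", "email": "jai87@gmail.com"}]}'
)
parsed = json.loads(raw)

# Plain-Python equivalent of the jq expression ".employees[].email"
emails = [employee["email"] for employee in parsed["employees"]]

# One record per extracted value, numbered from 1 like seq_num in the output above
records = [
    {"page_content": email, "metadata": {"seq_num": n}}
    for n, email in enumerate(emails, start=1)
]
print(records)
```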

## Public Dataset or Service Loaders

### Wikipedia Loader

In [9]:
from langchain_community.document_loaders import WikipediaLoader
 
# Load content from Wikipedia using WikipediaLoader
loader = WikipediaLoader("Machine_learning")
document = loader.load()

In [10]:
document[0].page_content

'Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Recently, generative artificial neural networks have been able to surpass many previous approaches in performance.Machine learning approaches have been applied to many fields including large language models, computer vision, speech recognition, email filtering, agriculture, and medicine, where it is too costly to develop algorithms to perform the needed tasks. ML is known in its application across business problems under the name predictive analytics. Although not all machine learning is statistically based, computational statistics is an important source of the field\'s methods.\nThe mathematical foundations of ML are provided by mathematical optimization (mathematical programming) methods. Data mining is a related (parallel) field of study, 

In [11]:
document[0].metadata

{'title': 'Machine learning',
 'summary': "Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Recently, generative artificial neural networks have been able to surpass many previous approaches in performance.Machine learning approaches have been applied to many fields including large language models, computer vision, speech recognition, email filtering, agriculture, and medicine, where it is too costly to develop algorithms to perform the needed tasks. ML is known in its application across business problems under the name predictive analytics. Although not all machine learning is statistically based, computational statistics is an important source of the field's methods.\nThe mathematical foundations of ML are provided by mathematical optimization (mathematical programming) methods. Data mining

### IMSDb Movie Script Loader

In [20]:
from langchain_community.document_loaders import IMSDbLoader

loader = IMSDbLoader("https://imsdb.com/scripts/BlacKkKlansman.html")

data = loader.load()

In [21]:
# Keep the first 5,000 characters and strip leading/trailing whitespace
formatted_text = data[0].page_content[:5000].strip()

# Print the formatted text
print(formatted_text)

BLACKKKLANSMAN
                         
                         
                         
                         
                                      Written by

                          Charlie Wachtel & David Rabinowitz

                                         and

                              Kevin Willmott & Spike Lee








                         FADE IN:
                         
          SCENE FROM "GONE WITH THE WIND"
                         
          Scarlett O'Hara, played by Vivian Leigh, walks through the
          Thousands of injured Confederate Soldiers pulling back to
          reveal the Famous Shot of the tattered Confederate Flag in
          "Gone with the Wind" as The Max Stein Music Score swells from
          Dixie to Taps.
                         
                                   BEAUREGARD- KLAN NARRATOR (O.S.)
                       They say they may have lost the
                       Battle but they didn't lose The War.
                  

### YouTubeLoader

In [33]:
#!pip install --upgrade --quiet  youtube-transcript-api


In [54]:
from langchain_community.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=O7xH9ZSp_B4", add_video_info=False
)

data = loader.load()

In [55]:
# Keep the first 5,000 characters and strip leading/trailing whitespace
formatted_text = data[0].page_content[:5000].strip()

# Print the formatted text
print(formatted_text)

this little wafer contains photosensitive pixels made with a semiconductor that is fabricated from copper and I'm not going to lie this wafer has totally kicked my butt results have been at times exciting this does that look at that and at times pretty discouraging that's a week's worth of work down the drain and I get to start over again which is just great and more than a few times I've accidentally blown up what I was testing oh it went this will probably be a long one so buckle up today we're talking about DIY semiconductors to make advanced electronics and light sensitive devices you need a semiconducting material and while silicon is obviously the most popular semiconductor it's not the only one there are dozens of different semiconducting materials out there being studied for a whole variety of reasons but the one I'm most interested in is copper oxide the oxidized state of plain old copper copper oxide is a ptype semiconductor with a direct band gap of 1.2 electron volts we don

#### Add video info and language preferences
- `language` parameter: a list of language codes in descending priority; `en` by default.
- `translation` parameter: a translation preference; an available transcript can be translated into your preferred language.
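
"Descending priority" means the first code in the list that has a transcript wins. The sketch below illustrates only that selection rule; `pick_language` and the set of available languages are hypothetical and do not reflect how `YoutubeLoader` or `youtube-transcript-api` are implemented.

```python
def pick_language(preferred, available):
    """Return the first code in `preferred` that is also in `available`, else None."""
    for code in preferred:
        if code in available:
            return code
    return None


available_transcripts = {"pt", "es"}  # made-up set of transcript languages
chosen = pick_language(["en", "pt"], available_transcripts)
print(chosen)
```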

In [56]:
loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=fYNxG8RwpaE",
    add_video_info=True,
    language=["en", "pt"],
    translation="pt",
)
ytdata = loader.load()

In [62]:
ytdata[0].dict()['page_content']

"bem-vindo de volta à série de teclado personalizado do zero, da última vez que criamos nosso layout, definimos nosso switch Matrix selecionou nosso microcontrolador e escolheu nossos recursos opcionais, se você não tiver feito alguns deles, tudo bem, mas você terá que fazê-los eventualmente para a placa de demonstração  Estou criando, estou usando um layout padrão de 60% no Mega 32 u4 e incluirei pinos para um display OLED neste vídeo criaremos o esquemático e o PCB que começa com a chave CAD kycad é um software gratuito e de código aberto para  projetando placas de circuito impresso e esquemas instale a versão mais recente do kycad e crie um novo projeto, você será saudado por dois arquivos, um esquema e um PCB, abra o esquema, vamos primeiro apresentar as teclas de atalho mais úteis e permite adicionar componentes gerais, como  resistores diodos interruptores e microcontroladores M permite que você mova um símbolo R permite que você gire esse símbolo e P permite que você coloque sím

In [48]:
# Keep the first 5,000 characters and strip leading/trailing whitespace
formatted_text = ytdata[0].page_content[:5000].strip()

# Print the formatted text
print(formatted_text)

welcome back to the custom keyboard from  scratch series last time we created our  layout defined our switch Matrix  selected our microcontroller and picked  our optional features if you haven't  done some of these that's okay but you  will have to do them eventually for the  demo board I'm creating I'm using a  standard 60% layout the at Mega 32 u4  and will include pins for an OLED  display in this video we'll create the  schematic and PCB which starts with key  CAD kycad is free and open source  software for designing printed circuit  boards and schematics install the newest  version of kycad then create a new  project you'll be greeted by two files a  schematic and a PCB open up the  schematic let's first introduce the most  useful hotkeys a allows you to add  General components such as resistors  diodes switches and microcontrollers M  allows you to move a symbol R allows you  to rotate that symbol and P allows you  to place power symbols such as voltage  rails and ground now that

## Text Splitters

Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example is splitting a long document into smaller chunks that fit into your model's context window. LangChain has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents.

When you want to deal with long pieces of text, it is necessary to split up that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep the semantically related pieces of text together. What "semantically related" means could depend on the type of text. This notebook showcases several ways to do that.

At a high level, text splitters work as follows:

- Split the text up into small, semantically meaningful chunks (often sentences).
- Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
- Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text with some overlap (to keep context between chunks).

That means there are two different axes along which you can customize your text splitter:

- How the text is split
- How the chunk size is measured
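
The combine-then-overlap steps can be sketched in a few lines. This is a simplified illustration under stated assumptions, not LangChain's implementation: `merge_pieces` is a hypothetical helper, the input is assumed to be already split into small pieces, and `length_function` is the hook that decides how size is measured.

```python
def merge_pieces(pieces, chunk_size, chunk_overlap, separator=" ", length_function=len):
    """Greedily pack small pieces into chunks, repeating trailing pieces as overlap."""
    chunks, current = [], []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and length_function(candidate) > chunk_size:
            # The chunk is full: emit it as its own piece of text.
            chunks.append(separator.join(current))
            # Carry trailing pieces forward while they fit in the overlap budget.
            carried = []
            for previous in reversed(current):
                if length_function(separator.join([previous] + carried)) > chunk_overlap:
                    break
                carried.insert(0, previous)
            # Drop carried pieces if they would push the next chunk over the limit.
            while carried and length_function(separator.join(carried + [piece])) > chunk_size:
                carried.pop(0)
            current = carried
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


pieces = ["aaaa", "bbbb", "cccc", "dddd", "eeee"]
chunks = merge_pieces(pieces, chunk_size=14, chunk_overlap=4)
print(chunks)

# Measuring size in words instead of characters only requires a different length_function.
by_words = merge_pieces(pieces, chunk_size=2, chunk_overlap=0,
                        length_function=lambda s: len(s.split()))
print(by_words)
```

In the first call, `"cccc"` closes one chunk and opens the next, which is the overlap that keeps context between neighbouring chunks.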

In [37]:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=200,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)

In [38]:
loader = WebBaseLoader("https://www.ibm.com/")
data = loader.load()

In [39]:
chunks = text_splitter.split_text(data[0].page_content)
len(chunks)

Created a chunk of size 265, which is longer than the specified 200


28

In [40]:
for chunk in chunks:
    print(chunk)
    print('----')

IBM - United States


Wimbledon fans enjoy a champion AI experience
----
Turning backhands into insights, providing match summaries and making statistics available to fans—the IBM client partnership is truly unmatched
  


    

See IBM at Wimbledon
----
See IBM at Wimbledon


Explore the AI experience



                                


  
  
      Latest news
----
IBM completes acquisition of StreamSets and webMethods

HCLTech and IBM Announce Generative AI Center of Excellence to Support Clients with Customized AI Solutions
----
IBM Consulting and Microsoft Collaborate to Help Clients Modernize Security Operations and Protect Against Cloud Identity Threats
----
IBM Completes Acquisition of StreamSets and webMethods, Bolstering its Automation, Data and AI Portfolios
----
Crédit Mutuel Alliance Fédérale Accelerates Deployment of Generative AI in Collaboration with IBM
----
IBM Study: Fan Engagement and Consumption of Sports Shifting, Reveals New Opportunities for Technology Integrat

In [41]:
documents = text_splitter.create_documents([data[0].page_content])
len(documents)

Created a chunk of size 265, which is longer than the specified 200


28

In [42]:
for doc in documents:
    print(doc)
    print('----')

page_content='IBM - United States


Wimbledon fans enjoy a champion AI experience'
----
page_content='Turning backhands into insights, providing match summaries and making statistics available to fans—the IBM client partnership is truly unmatched
  


    

See IBM at Wimbledon'
----
page_content='See IBM at Wimbledon


Explore the AI experience



                                


  
  
      Latest news'
----
page_content='IBM completes acquisition of StreamSets and webMethods

HCLTech and IBM Announce Generative AI Center of Excellence to Support Clients with Customized AI Solutions'
----
page_content='IBM Consulting and Microsoft Collaborate to Help Clients Modernize Security Operations and Protect Against Cloud Identity Threats'
----
page_content='IBM Completes Acquisition of StreamSets and webMethods, Bolstering its Automation, Data and AI Portfolios'
----
page_content='Crédit Mutuel Alliance Fédérale Accelerates Deployment of Generative AI in Collaboration with IBM'
----
page_c

## RecursiveCharacterTextSplitter

This text splitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is ["\n\n", "\n", " ", ""]. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.
- How the text is split: by list of characters.
- How the chunk size is measured: by number of characters.
- `chunk_size` and `chunk_overlap` control the size of the chunks and the overlap between them: the `split_text` method recursively splits the text on successive separators until each split is shorter than `chunk_size`.
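
The "try each separator in order" idea can be sketched as a short recursive function. This is a simplified illustration, not the `RecursiveCharacterTextSplitter` source: `recursive_split` is a hypothetical helper, it measures size with `len`, and it leaves out chunk overlap to stay short.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator first, recursing only into oversized parts."""
    if len(text) <= chunk_size or not separators:
        return [text] if text.strip() else []
    sep, finer = separators[0], separators[1:]
    parts = list(text) if sep == "" else text.split(sep)
    chunks, current = [], ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate          # keep related text together while it fits
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # This part alone is too long: retry it with the next, finer separator.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = ("First paragraph is short.\n\n"
          "Second paragraph is quite a bit longer than the limit allows.")
result = recursive_split(sample, chunk_size=30)
for piece in result:
    print(repr(piece))
```

The short first paragraph stays whole, while the long second paragraph falls through to the space separator and is broken on word boundaries.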

In [43]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

rectext_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=100,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)

In [44]:
texts = rectext_splitter.create_documents([data[0].page_content])

In [45]:
for text in texts:
    print(text)
    print("-----")

page_content='IBM - United States'
-----
page_content='Wimbledon fans enjoy a champion AI experience'
-----
page_content='Turning backhands into insights, providing match summaries and making statistics available to'
-----
page_content='available to fans—the IBM client partnership is truly unmatched'
-----
page_content='See IBM at Wimbledon


Explore the AI experience'
-----
page_content='Latest news'
-----
page_content='IBM completes acquisition of StreamSets and webMethods'
-----
page_content='HCLTech and IBM Announce Generative AI Center of Excellence to Support Clients with Customized AI'
-----
page_content='with Customized AI Solutions'
-----
page_content='IBM Consulting and Microsoft Collaborate to Help Clients Modernize Security Operations and Protect'
-----
page_content='and Protect Against Cloud Identity Threats'
-----
page_content='IBM Completes Acquisition of StreamSets and webMethods, Bolstering its Automation, Data and AI'
-----
page_content='Data and AI Portfolios'
-----
