# Convert Image To Text 

## 1. Setup

To prepare your environment, you need to install some packages and enter credentials for the Watson services.

### 1.1 Install the necessary packages

You will need to install the following packages:

- PIL: The Python Imaging Library (PIL) adds image processing capabilities to your Python interpreter.
- pytesseract: Python-tesseract is a Python wrapper for Google's Tesseract-OCR.
- ibm-cos-sdk: The IBM Cloud Object Storage library for Python.
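Before running the install cells, it can help to check what is already available. The sketch below is one way to do that with the standard library only. It assumes the import names `PIL`, `pytesseract`, and `ibm_boto3`, and that the Tesseract engine (which pytesseract only wraps and which must be installed separately) is exposed as a `tesseract` executable on the `PATH`. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import shutil

# Import names (not pip names) of the packages used in this notebook.
REQUIRED_MODULES = ("PIL", "pytesseract", "ibm_boto3")


def check_environment(modules=REQUIRED_MODULES, binary="tesseract"):
    """Return a dict mapping each requirement to True (found) or False."""
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # pytesseract shells out to the OCR engine, so the executable must exist too.
    status[binary + " (binary)"] = shutil.which(binary) is not None
    return status


for requirement, found in check_environment().items():
    print(f"{requirement}: {'found' if found else 'MISSING'}")
```

Anything reported as missing can then be installed with the cells that follow, or with the operating system's package manager in the case of the Tesseract engine.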

In [40]:
!pip install pytesseract

You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [41]:
!pip install ibm-cos-sdk

You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


### 1.2 Import packages and libraries
Import the packages and libraries that you'll use:

In [42]:
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract

import ibm_boto3
from ibm_botocore.client import Config

import json
import requests

## 2. Configuration
Set the configurable items of the notebook below.

### 2.1 Global Variables
Add global variables, such as the list of input image paths.

In [43]:
req_paths=['/Users/muralidhar/Desktop/Data/Form1 copy 2.jpg']
#credentials_path='C:/Users/IBM_ADMIN/credentials/credentials.json'

### 2.2 Connect to Object Storage

In [44]:
'''Creating client...
'''

with open('/Users/muralidhar/Desktop/credentials.json') as data_file:
    credentials = json.load(data_file)
print("Service credential:")
print(json.dumps(credentials, indent=2))
endpoints = requests.get(credentials.get('endpoints')).json()

Service credential:
{
  "apikey": "SeulxYm169Os5se6rmlPmPLLmEimXNvVMf2SHDPHx2fw",
  "cos_hmac_keys": {
    "access_key_id": "7008a020378340f38b8d57f3d747185b",
    "secret_access_key": "64de8f9c116d43ead2ab60eed5d748dc1fa4da04ce9a0bb8"
  },
  "endpoints": "https://cos-service.bluemix.net/endpoints",
  "iam_apikey_description": "Auto generated apikey during resource-key operation for Instance - crn:v1:bluemix:public:cloud-object-storage:global:a/f2043d7defcd090971c66795a834d43c:8e7d0cae-69cd-45ec-8430-f0389e234d95::",
  "iam_apikey_name": "auto-generated-apikey-7008a020-3783-40f3-8b8d-57f3d747185b",
  "iam_role_crn": "crn:v1:bluemix:public:iam::::role:Administrator",
  "iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/f2043d7defcd090971c66795a834d43c::serviceid:ServiceId-d3e3c90c-07f4-4eb3-8d17-9cef425ff3d6",
  "resource_instance_id": "crn:v1:bluemix:public:cloud-object-storage:global:a/f2043d7defcd090971c66795a834d43c:8e7d0cae-69cd-45ec-8430-f0389e234d95::"
}
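Printing the full service credential places the API key and HMAC keys in the notebook output. A minimal sketch of masking those values before printing is shown below. The set of sensitive key names is an assumption based on the fields in the output above, the helper names `mask_secrets` and `_mask` are hypothetical, and the example values are placeholders rather than real credentials.

```python
import json

# Keys whose values should not appear in notebook output (assumed list).
SENSITIVE_KEYS = {"apikey", "access_key_id", "secret_access_key"}


def _mask(secret, visible):
    """Keep the first few characters and replace the rest with asterisks."""
    text = str(secret)
    return text[:visible] + "*" * max(len(text) - visible, 0)


def mask_secrets(value, visible=4):
    """Return a copy of a JSON-like structure with sensitive values masked."""
    if isinstance(value, dict):
        return {
            key: _mask(item, visible) if key in SENSITIVE_KEYS
            else mask_secrets(item, visible)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [mask_secrets(item, visible) for item in value]
    return value


# Placeholder credentials for illustration only.
example = {
    "apikey": "EXAMPLEKEY123456",
    "cos_hmac_keys": {"access_key_id": "abcd1234", "secret_access_key": "wxyz9876"},
    "endpoints": "https://cos-service.bluemix.net/endpoints",
}
print(json.dumps(mask_secrets(example), indent=2))
```

In the notebook, the same call could wrap the loaded dictionary, for example `print(json.dumps(mask_secrets(credentials), indent=2))`, while the unmasked `credentials` object is still used to create the client.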


In [45]:
''' Identify the region based on the region of the cloud object storage
'''
endpoints

{'identity-endpoints': {'iam-policy': 'iampap.bluemix.net',
  'iam-token': 'iam.bluemix.net'},
 'service-endpoints': {'cross-region': {'ap': {'private': {'Hong Kong': 's3.hkg-ap-geo.objectstorage.service.networklayer.com',
     'Seoul': 's3.seo-ap-geo.objectstorage.service.networklayer.com',
     'Tokyo': 's3.tok-ap-geo.objectstorage.service.networklayer.com',
     'ap-geo': 's3.ap-geo.objectstorage.service.networklayer.com'},
    'public': {'Hong Kong': 's3.hkg-ap-geo.objectstorage.softlayer.net',
     'Seoul': 's3.seo-ap-geo.objectstorage.softlayer.net',
     'Tokyo': 's3.tok-ap-geo.objectstorage.softlayer.net',
     'ap-geo': 's3.ap-geo.objectstorage.softlayer.net'}},
   'eu': {'private': {'Amsterdam': 's3.ams-eu-geo.objectstorage.service.networklayer.com',
     'Frankfurt': 's3.fra-eu-geo.objectstorage.service.networklayer.com',
     'Milan': 's3.mil-eu-geo.objectstorage.service.networklayer.com',
     'eu-geo': ' s3.eu-geo.objectstorage.service.networklayer.com'},
    'public': {'

In [46]:
''' Creating Client
'''
iam_host = (endpoints['identity-endpoints']['iam-token'])
cos_host = (endpoints['service-endpoints']['cross-region']['us']['public']['us-geo'])
api_key = credentials.get('apikey')
service_instance_id = credentials.get('resource_instance_id')
# Construct auth and cos endpoint
auth_endpoint = "https://" + iam_host + "/oidc/token"
service_endpoint = "https://" + cos_host
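The endpoint lookup above hard-codes one geography and location. As a rough sketch, the same construction can be wrapped in a helper that takes the geography and location as parameters and reports a readable error when the combination is not in the listing. The function name `build_cos_urls` and its parameters are hypothetical, and the sample dictionary is a trimmed copy of the endpoint listing printed earlier.

```python
def build_cos_urls(endpoints, geo, location, visibility="public"):
    """Return (auth_endpoint, service_endpoint) for one cross-region location."""
    iam_host = endpoints["identity-endpoints"]["iam-token"]
    try:
        cos_host = endpoints["service-endpoints"]["cross-region"][geo][visibility][location]
    except KeyError as missing:
        raise KeyError(
            f"No cross-region endpoint for {geo}/{visibility}/{location}: "
            f"missing key {missing}"
        ) from None
    # strip() guards against stray whitespace in the endpoint listing.
    return f"https://{iam_host}/oidc/token", f"https://{cos_host.strip()}"


# Trimmed sample of the endpoint listing, for illustration only.
sample_endpoints = {
    "identity-endpoints": {"iam-policy": "iampap.bluemix.net", "iam-token": "iam.bluemix.net"},
    "service-endpoints": {
        "cross-region": {
            "ap": {"public": {"ap-geo": "s3.ap-geo.objectstorage.softlayer.net"}}
        }
    },
}

auth_url, service_url = build_cos_urls(sample_endpoints, geo="ap", location="ap-geo")
print(auth_url)
print(service_url)
```

Choose the geography and location that match the region where the Cloud Object Storage bucket was created.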

In [47]:
cos = ibm_boto3.client('s3',
                    ibm_api_key_id=api_key,
                    ibm_service_instance_id=service_instance_id,
                    ibm_auth_endpoint=auth_endpoint,
                    config=Config(signature_version='oauth'),
                    endpoint_url=service_endpoint)

In [48]:
response = cos.list_buckets()
buckets = [bucket['Name'] for bucket in response['Buckets']]
print("Current Bucket List:")
print(json.dumps(buckets, indent=2))
print("---")

Current Bucket List:
[
  "imagerecognitionpattern-donotdelete-pr-7whfpase0vr47w"
]
---


In [49]:
''' Choose the desired bucket name as per your project's name on Watson Studio
'''

bucket_name='imagerecognitionpattern-donotdelete-pr-7whfpase0vr47w'

In [50]:
def put_file(filename, filecontents):
    '''Write file to Cloud Object Storage'''
    resp = cos.put_object(Bucket=bucket_name, Key=filename, Body=filecontents)
    return resp

def load_string(fileobject):
    '''Load the file contents into a Python string'''
    text = fileobject.read()
    return text
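To check that an upload worked, the stored object can be read back. The sketch below assumes the `ibm_boto3` client exposes the same S3-style `get_object` call as boto3, returning a dictionary whose `Body` entry is a readable stream. The helper `get_file` and the `InMemoryClient` stand-in are hypothetical; the stand-in exists only so the example runs without a Cloud Object Storage account.

```python
import io


def get_file(client, bucket, key, encoding="utf-8"):
    """Read an object back from object storage and return it as text."""
    response = client.get_object(Bucket=bucket, Key=key)
    data = response["Body"].read()
    return data.decode(encoding) if isinstance(data, bytes) else data


class InMemoryClient:
    """Hypothetical stand-in for the COS client, used only for this demo."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        # Store text as bytes, as an object store would.
        payload = Body.encode("utf-8") if isinstance(Body, str) else Body
        self._objects[(Bucket, Key)] = payload
        return {"ResponseMetadata": {"HTTPStatusCode": 200}}

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


demo = InMemoryClient()
demo.put_object(Bucket="demo-bucket", Key="form-doc-1.txt", Body="PURCHASE AGREEMENT")
print(get_file(demo, "demo-bucket", "form-doc-1.txt"))
```

With the real client, the call would look like `get_file(cos, bucket_name, 'form-doc-1.txt')`.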

## 3. Convert
This function extracts the text from the input image and stores it as a text file in Cloud Object Storage.

In [51]:
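def convert(filename, name):
    '''Run OCR on the image at filename and store the text in COS under name'''
    print("name", name)
    img = Image.open(filename)
    # Extract the text from the image with Tesseract
    text1 = pytesseract.image_to_string(img)
    print("Text from Image")
    print(text1)
    # Upload the extracted text to the bucket as a text object
    put_file(name, text1)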

In [52]:
for i, f in enumerate(req_paths, start=1):
    name = 'form-doc-' + str(i) + '.txt'
    convert(f, name)

name form-doc-1.txt
Text from Image
PURCHASE AGREEMENT

THIS IS A LEGALLY BINDING CONTRACT BETWEEN PURCHASER AND SELLER.
IF YOU DO NOT UNDERSTAND IT, SEEK LEGAL ADVICE.

1. PARTIES TO CONTRACT - PROPERTY. Purchaser and Seller acknowledge that Broker is ABC
is not the limited agent of both parties to this transaction as outlined in Section III of the
Agency Agreement Addendum as authorized by Purchaser and Seller, XYZ, hereinafter
referred to as Purchaser, offers and agrees to purchase from , UVW, hereinafter referred to
as Seller, upon the terms and conditions set forth, the property legally described as:

 

 

also known as

 

 

2. EARNEST MONEY DEPOSIT‘ Earnest Money in the amount of (S )

DOLLARS Cash
Check ,unless otherwise noted herein, shall be deposited into the trust account of
the listing selling broker on the next legal banking day after

acceptance of this offer.

Other earnest money provisions:

 

 

3. PURCHASE PRICE. The total purchase price is to be (S )
DOLLARS
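The raw OCR output above contains many whitespace-only lines where the form had blank fields. If tidier text is wanted before it is stored, one possible post-processing step is sketched below. The helper name `clean_ocr_text` is hypothetical, and the step is optional: it only normalizes whitespace and does not correct recognition errors.

```python
import re


def clean_ocr_text(text):
    """Normalize whitespace in raw OCR output."""
    # Treat non-breaking spaces like ordinary spaces.
    text = text.replace("\u00a0", " ")
    # Strip trailing whitespace from every line.
    lines = [line.rstrip() for line in text.splitlines()]
    # Collapse runs of blank lines into a single blank line.
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip()


raw = "also known as\n\n \n\n \n\n2. EARNEST MONEY DEPOSIT"
print(clean_ocr_text(raw))
```

In `convert`, this could be applied to the result of `pytesseract.image_to_string` before the text is passed to `put_file`.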

 

