In [None]:
print('Hello World')

# **Starting with Python Basics**

# Python Objects and Data Structures

We'll learn about the following topics:

    1.) Variable
    2.) Lists
    3.) Dictionaries
    4.) Tuple
    5.) Sets

## Variable

A variable in a programming language is like a labeled box that stores a piece of information. You can use this labeled box to hold different types of data, such as numbers, text, or other values. By using variables, you can easily refer to and manipulate this information in your code.

Please consider the following rules before naming the variables:

    1.) Names can not start with a number.
    2.) There can be no spaces in the name, use _ instead.
    3.) Can't use any of these symbols :'",<>/?|\()!@#$%^&*~-+
    4.) It's considered best practice that names are lowercase.
    5.) Avoid using the characters 'l' (lowercase letter el), 'O' (uppercase letter oh),
       or 'I' (uppercase letter eye) as single character variable names.
    6.) Avoid using words that have special meaning in Python like "list" and "str"
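
Some of the rules above can be checked from Python itself. The cell below is a minimal sketch, not an official validator: the candidate names are made up for illustration, and it only covers the hard rules (leading digits, spaces, symbols, reserved keywords). Names such as `list` and `str` are legal identifiers, so this check will not flag them; they are discouraged only because they hide Python's built-ins.

```python
import keyword

# Candidate names to test (all hypothetical examples).
candidates = ['first_name', '2nd_place', 'my income', 'tax-rate', 'class']

name_checks = {}
for name in candidates:
    # str.isidentifier() rejects leading digits, spaces, and symbols;
    # keyword.iskeyword() flags reserved words such as 'class'.
    name_checks[name] = name.isidentifier() and not keyword.iskeyword(name)

for name, is_valid in name_checks.items():
    print(name, '->', is_valid)
```

Only `first_name` passes; each of the other names breaks one of the rules.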
    
Descriptive variable names make it much easier to keep track of the different values in your code. For example:

In [None]:
first_name = 'Sam'

In [None]:
type(first_name)

In [None]:
#assigning different number types to different variables
my_income = 100
tax_rate = 0.1

In [None]:
type(my_income)

**TIP: Use object names to keep better track of what's going on in your code!**

In [None]:
#performing calculation by using variables
my_taxes = my_income*tax_rate

In [None]:
my_taxes,my_income

**OR**

In [None]:
print(my_taxes,my_income)

### Task 1

Calculate the area of a rectangle having length of 15 cm and width of 10 cm. Use variable assignment and perform calculations using variables. Also display the result.

In [None]:
# Assign values to variables
length = 15
width = 10

In [None]:
# Perform Calculation
area = length*width

In [None]:
# Display the result
area

# Lists

Lists can be thought of as the most general version of a *sequence* in Python. They are mutable, meaning the elements inside a list can be changed!

In this section we will learn about:
    
    1.) Creating lists
    2.) Indexing and Slicing Lists
    3.) Nesting Lists

Lists are constructed with brackets [] and commas separating every element in the list.

Let's go ahead and see how we can construct lists!

In [None]:
my_list = [ ]

In [None]:
type(my_list)

In [None]:
# Assign a list to a variable named my_list
my_list=[1,2,3]

In [None]:
my_list

We just created a list of integers, but lists can actually hold different object types. For example:

In [None]:
my_list = ['A string',23,100.232,'o']

In [None]:
my_list

### Indexing and Slicing
We know Lists are a sequence, which means Python can use indexes to call parts of the sequence. Let's learn how this works.

In Python, we use brackets <code>[]</code> after an object to call its index. We should also note that indexing starts at 0 for Python.

We can use a <code>:</code> to perform *slicing*, which grabs a range of elements: everything from a start index up to, but not including, a stop index.

Let's create a new list called <code>my_list</code> and then walk through a few examples of indexing:

In [None]:
my_list = ['one','two','three',4,5]

In [None]:
# Grab element at index 0
my_list[0]

In [None]:
# Grab index 1 and everything past it
my_list[1:]

In [None]:
# Grab everything up to (but not including) index 3
my_list[:3]
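
The slices above follow the general pattern `sequence[start:stop]`, where the stop index is excluded. The cell below is a small sketch of that pattern; the list `letters` and the variable names are made up for illustration.

```python
letters = ['a', 'b', 'c', 'd', 'e']

first_three = letters[:3]    # indices 0, 1, 2 (stop index 3 is excluded)
middle = letters[1:4]        # indices 1, 2, 3
from_second = letters[1:]    # index 1 through the end

print(first_three)
print(middle)
print(from_second)
print(letters)               # slicing returns a new list; the original is unchanged
```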

You can also index from the end of a list using negative numbers. <code>my_list[0]</code> is the first item and <code>my_list[-1]</code> is the last one. Try the following code.

In [None]:
# Grab the last index in reverse
my_list[-1]

**Try yourself!**

In [None]:
# Grab the second last index in reverse
my_list[-2]

In [None]:
# Add 6 to the end of the list
my_list.append(6)

In [None]:
my_list

In [None]:
# Insert the value 1 at index 0
my_list.insert(0,1)

In [None]:
my_list
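
Because lists are mutable, they can be changed in place in several ways. The cell below is a short sketch using a made-up list called `numbers`; it combines index assignment with the `append()`, `insert()`, and `pop()` list methods.

```python
numbers = [10, 20, 30]

numbers[0] = 5            # replace the element at index 0
numbers.append(40)        # add 40 to the end
numbers.insert(1, 15)     # insert 15 at index 1, shifting later items right
removed = numbers.pop()   # remove and return the last element

print(numbers)
print(removed)
```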

# Dictionaries

A dictionary stores information by pairing up keys with values. Each key must be unique, but values can be anything.

This section will serve as a brief introduction to dictionaries and consist of:

    1.) Constructing a Dictionary
    2.) Accessing objects from a dictionary

## Constructing a Dictionary
Let's see how we can construct dictionaries to get a better understanding of how they work!

In [None]:
# Make a dictionary with {} and : to signify a key and a value
my_dict = {'key1':'value1','key2':'value2'}

In [None]:
my_dict

In [None]:
# Call values by their key
my_dict['key1']
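
Since each key is unique, assigning to an existing key replaces its value instead of creating a second entry. The cell below sketches this with a made-up `prices` dictionary, and also shows the `.get()` method, which avoids a `KeyError` when a key is missing.

```python
prices = {'apple': 2.5, 'milk': 1.2}

prices['bread'] = 3.0     # a new key adds a new pair
prices['apple'] = 2.75    # an existing key is overwritten, not duplicated

missing = prices.get('eggs')         # returns None instead of raising KeyError
fallback = prices.get('eggs', 0.0)   # or a default value that you supply

print(prices)
print(missing, fallback)
```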

**Task: Access the value stored under 'key2'**

In [None]:
# write code here

# Tuples

In Python, tuples are very similar to lists. However, unlike lists, they are *immutable*, meaning they cannot be changed. You would use tuples to represent things that shouldn't be changed, such as days of the week or dates on a calendar.

In this section, we will get a brief overview of the following:

    1.) Constructing Tuples
    2.) Immutability


## Constructing Tuples

Tuples are constructed with parentheses () and commas separating the elements. For example:

In [None]:
# Create a tuple
t = (1,2,3)

In [None]:
# Can also mix object types
t = ('one',2)

# Show
t

## Immutability

In [None]:
# Tuples are immutable, so this assignment raises a TypeError
t[0]= 'change'
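
The assignment above stops the cell with an error. The sketch below, using a made-up tuple called `point`, catches that error so the message can be inspected, and then shows one common workaround: building a new tuple rather than modifying the old one.

```python
point = ('one', 2)

try:
    point[0] = 'change'
except TypeError as error:
    # Item assignment is not supported on tuples
    error_message = str(error)
    print('Could not modify the tuple:', error_message)

# To "change" a tuple, build a new one from the pieces you want to keep
new_point = ('change',) + point[1:]
print(new_point)
print(point)   # the original tuple is untouched
```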

# Sets

Sets are unordered collections of *unique* elements. We can construct them with the set() function. Let's go ahead and make a set to see how it works.

In [None]:
x = set()

In [None]:
type(x)

In [None]:
# We add to sets with the add() method
x.add(1)

In [None]:
#Show
x
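
The uniqueness rule is easiest to see when a value is added twice. The cell below is a small sketch with made-up names (`values`, `unique_letters`); it also builds a set from a list that contains duplicates.

```python
values = set()
values.add(1)
values.add(2)
values.add(1)      # 1 is already present, so the set does not change

# Building a set from a list collapses the duplicates
unique_letters = set(['a', 'b', 'a', 'c', 'b'])

print(len(values))
print(sorted(unique_letters))   # sorted() gives a predictable order for display
```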

# Control Flow and Functions

### if, elif, else Statements

<code>if</code> statements in Python allow us to tell the computer to perform alternative actions based on a certain set of results.

Verbally, we can imagine we are telling the computer:

"Hey if this case happens, perform some action"

We can then expand the idea further with <code>elif</code> and <code>else</code> statements, which allow us to tell the computer:

"Hey if this case happens, perform some action. Else, if another case happens, perform some other action. Else, if *none* of the above cases happened, perform this action."

Let's go ahead and look at the syntax format for <code>if</code> statements to get a better idea of this:

    if case1:
        perform action1
    elif case2:
        perform action2
    else:
        perform action3

## First Example

Let's see a quick example of this:

In [None]:
if True:
    print('It was true!')

In [None]:
x = False

if x:
    print('x was True!')
else:
    print('I will be printed in any case where x is not true')
    print('Here it is')

In [None]:
loc = 'Bank'

if loc == 'Auto Shop':
    print('Welcome to the Auto Shop!')

elif loc == 'Bank':
    print('Welcome to the bank!')

else:
    print('Where are you?')
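
One detail worth noting is that Python checks the conditions from top to bottom and runs only the first branch that matches. The cell below is a sketch with a made-up `score` and grading thresholds chosen purely for illustration.

```python
score = 72

if score >= 90:
    grade = 'A'
elif score >= 70:
    grade = 'B'      # the first matching branch runs; later branches are skipped
elif score >= 50:
    grade = 'C'      # 72 also satisfies this test, but it is never reached
else:
    grade = 'F'

print(grade)
```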

# for Loops

A <code>for</code> loop acts as an iterator in Python; it goes through items that are in a *sequence* or any other iterable item. Objects that we've learned about that we can iterate over include strings, lists, tuples, and even built-in iterables for dictionaries, such as keys or values.


Here's the general format for a <code>for</code> loop in Python:

    for item in object:
        statements to do stuff

The variable name used for the item is completely up to the coder, so use your best judgment for choosing a name that makes sense and you will be able to understand when revisiting your code. This item name can then be referenced inside your loop, for example if you wanted to use <code>if</code> statements to perform checks.

Let's go ahead and work through several examples of <code>for</code> loops using a variety of data object types. We'll start simple and build more complexity later on.

## Example 1
Iterating through a list

In [None]:
# We'll learn how to automate this sort of list in the next lecture
list1 = [1,2,3,4,5,6,7,8,9,10]
list1

In [None]:
for value in list1:
    print(value)

In [None]:
# Use enumerate() to get each index together with its value
for index, value in enumerate(list1):
    print(index,value)

## Example 2
Let's print only the even numbers from that list!

In [None]:
for num in list1:
    if num % 2 == 0:
        print(num)
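
Lists are not the only objects a <code>for</code> loop can walk through. The cell below is a brief sketch that iterates over a string, a tuple, and a dictionary; the names `collected` and `ages` are made up for illustration.

```python
collected = []

for char in 'abc':                  # strings yield one character at a time
    collected.append(char)

for item in (1, 2):                 # tuples iterate just like lists
    collected.append(item)

ages = {'Sam': 30, 'Ana': 25}
for name, age in ages.items():      # .items() yields (key, value) pairs
    collected.append((name, age))

print(collected)
```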

# while Loops

The <code>while</code> statement in Python is one of the most general ways to perform iteration. A <code>while</code> statement will repeatedly execute a single statement or group of statements as long as the condition is true. The reason it is called a 'loop' is because the code statements are looped through over and over again until the condition is no longer met.

The general format of a while loop is:

    while test:
        code statements
    else:
        final code statements

Let's look at a few simple <code>while</code> loops in action.

In [None]:
x = int(input("enter a number"))

while x < 10:
    print('x is currently: ',x,end= " ")
    print(' x is still less than 10, adding 1 to x')
    x+=1

else:
    print('All Done!')
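
The <code>else</code> part of a <code>while</code> loop runs only when the condition becomes false; it is skipped if the loop is left early with <code>break</code>. The cell below is a sketch of that behavior. The helper `find_first_multiple` is a hypothetical function written for this illustration, and it uses <code>def</code>, which is covered in the Functions section.

```python
def find_first_multiple(start, divisor, limit):
    """Search from start up to (not including) limit for a multiple of divisor."""
    n = start
    while n < limit:
        if n % divisor == 0:
            result = f'found {n}'
            break                 # leaving via break skips the else block
        n += 1
    else:
        result = 'none found'     # runs only when the condition became False
    return result

print(find_first_multiple(5, 4, 10))
print(find_first_multiple(5, 7, 7))
```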

# Functions

## Introduction to Functions

**So what is a function?**

Formally, a function is a useful device that groups together a set of statements so they can be run more than once. They can also let us specify parameters that can serve as inputs to the functions.

On a more fundamental level, functions allow us to not have to repeatedly write the same code again and again.

Functions will be one of the most basic levels of reusing code in Python, and they will also allow us to start thinking about program design.

In [None]:
def name_of_function(arg1,arg2):
    '''
    This is where the function's Document String (docstring) goes
    '''

In [None]:
name_of_function(1,2)

### Example 1: A simple print 'hello' function

In [None]:
def say_hello():
    print('hello')

In [None]:
say_hello()

In [None]:
def add_num(num1,num2):
    #return f"{num1+num2}"
    return "Result is :"+str(num1 + num2)

In [None]:
# Can save as variable due to return
result = add_num(num1=4,num2=5)

In [None]:
result

In [None]:
type(add_num(num1=4,num2=5))

# Functions exercise
Write a function to check if a number is even or odd.

In [None]:
# define the function
def iseven(num):
    if num % 2 == 0:
        print("Even")
    else:
        print("Odd")

In [None]:
# Take input from the user to check whether the number is even or odd
def iseven():
    num = int(input("Enter your number"))
    if num % 2 == 0:
        print("Even")
    else:
        print("Odd")


In [None]:
# call the function
iseven()
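
The versions above print their answer directly. An alternative design, sketched below, is to return a value so the caller decides what to do with it. The name `is_even` and the sample numbers are assumptions made for this illustration.

```python
def is_even(num):
    """Return True if num is even, otherwise False."""
    return num % 2 == 0

labels = []
for n in [4, 7, 0]:
    if is_even(n):
        labels.append('Even')
    else:
        labels.append('Odd')

print(labels)
```

Returning a value instead of printing makes the function easier to reuse, for example inside an <code>if</code> statement or a loop.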

# Reading Files

## Text File

In [None]:
file = open('genai.txt', 'r')

# Reading the content of the file
content = file.read()

# Printing the content
print(content)

# Closing the file
file.close()

1. **Opening the file:** open('genai.txt', 'r') opens the file named "genai.txt" in read mode.
2. **Reading the content:** file.read() reads the entire content of the file into a variable named content.
3. **Printing the content:** print(content) prints the content to the screen.
4. **Closing the file:** file.close() closes the file to free up resources.
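
The same four steps can also be written with a <code>with</code> block, which closes the file automatically even if an error occurs while reading. The cell below is a minimal sketch: so that it runs anywhere, it first writes a small sample file into the system's temporary folder. The file name `sample_notes.txt` and its content are made up for illustration.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained
sample_path = os.path.join(tempfile.gettempdir(), 'sample_notes.txt')
with open(sample_path, 'w', encoding='utf-8') as file:
    file.write('first line\nsecond line\n')

# Open, read, and close in one block; no explicit file.close() is needed
with open(sample_path, 'r', encoding='utf-8') as file:
    content = file.read()

print(content)
print(file.closed)   # the file is already closed once the with block ends
```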

## Task
Using any .txt file with content, write a Python script that reads the contents and prints them out.

## PDF File


**PyPDF2** is a Python library used to work with PDF files. It allows you to read, manipulate, and write PDF documents. You can use it to extract text, merge pages, split documents, and more.

Documentation Link : [click to open](https://pypi.org/project/PyPDF2/)

In [None]:
!pip install PyPDF2

In [None]:
import PyPDF2

In [None]:
with open('./genai.pdf', 'rb') as file:

    reader = PyPDF2.PdfReader(file)

    # Get the number of pages
    num_pages = len(reader.pages)

    # Iterate through all pages and extract text
    for page_num in range(num_pages):
        page = reader.pages[page_num]
        text = page.extract_text()

        # Print the text from each page
        print(f"Page {page_num + 1}:")
        print(text)
        print("\n")

## Task
Using any PDF file with content, write a Python script that reads the contents and prints them out.

In [None]:
# write your code here

## Docx File


### **python-docx:**
python-docx is a Python library used for creating, modifying, and extracting information from Microsoft Word (.docx) files.

**Documentation Link** : [click to open](https://pypi.org/project/python-docx/)

In [None]:
#!pip install python-docx

In [None]:
import docx

In [None]:
# Function to read text from a .docx file
def read_docx(file_path):
    doc = docx.Document(file_path)
    full_text = []
    for para in doc.paragraphs:
        full_text.append(para.text)
    return "\n".join(full_text)

In [None]:
docx_file = 'genai.docx'  # Replace with your .docx file path
text = read_docx(docx_file)
print(text)

## Task
Using any .docx file with content, write a Python script that reads the contents and prints them out.

In [None]:
# write your code here

## Reading Text from Image

In [None]:
#!pip install pytesseract
#!pip install  Pillow

### **Pillow:**
Pillow is the actively maintained fork of the Python Imaging Library (PIL) and adds image processing capabilities to your Python interpreter. It allows you to open, manipulate, and save many different image file formats. Common tasks include image resizing, cropping, drawing, and format conversion.

**Documentation link** : [click to open](https://pypi.org/project/pillow/)

### **Tesseract:**
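Tesseract is an open-source Optical Character Recognition (OCR) engine, originally developed at Hewlett-Packard and later sponsored by Google. It is designed to convert images containing text into machine-readable text. Tesseract can recognize text in various languages and is widely used for extracting text from scanned documents, photographs, and other image types. The pytesseract package is a Python wrapper around the engine, so Tesseract itself must also be installed on the system.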

**Documentation link** : [click to open](https://pypi.org/project/pytesseract/)

In [None]:
#!apt-get update
#!apt-get install -y tesseract-ocr
#!apt-get install -y libtesseract-dev

In [None]:
from PIL import Image
import pytesseract

In [None]:
# Open an image file
image_path = '/content/What-Is-Generative-AI.jpg'
img = Image.open(image_path)

In [None]:
img

In [None]:
# Use pytesseract to do OCR on the image
text = pytesseract.image_to_string(img)

In [None]:
print(text)

### **Unstructured.io**
The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. Its use cases revolve around streamlining and optimizing the data processing workflow for LLMs. Its modular functions and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient in transforming unstructured data into structured outputs.

**Documentation Link**: [click to open](https://github.com/Unstructured-IO/unstructured)


In [None]:
#!pip install "unstructured[all-docs]"

**Reading a PDF file using Unstructured.io**

In [None]:
from unstructured.partition.pdf import partition_pdf

In [None]:
pdf_elements = partition_pdf("/content/genai.pdf")

In [None]:
print("\n\n".join([str(el) for el in pdf_elements]))

**Reading a Docx file using Unstructured.io**

In [None]:
from unstructured.partition.docx import partition_docx

In [None]:
doc_elements = partition_docx("/content/genai.docx")

In [None]:
print("\n\n".join([str(el) for el in doc_elements]))