Q.1. In what modes should the File objects for PdfFileReader() and PdfFileWriter() be opened?

A.1. When working with PdfFileReader and PdfFileWriter objects from the PyPDF2 library, open the File objects in the following modes:

1. PdfFileReader:
To read a PDF file, open it in read-binary mode ('rb') and pass the File object to PdfFileReader().
2. PdfFileWriter:
PdfFileWriter() itself takes no File object. To save the PDF it builds (for example, after merging or adding pages), open the output file in write-binary mode ('wb') and pass that File object to the writer's write() method.
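
The two binary modes can be tried without PyPDF2 at all. The following is a minimal sketch using only the standard library; the file name example.pdf and the bytes written are stand-ins chosen for illustration (just the header that PDF files begin with, not a complete PDF).

```python
import os
import tempfile

# Hypothetical stand-in content: only the header bytes a PDF begins with,
# not a complete, valid PDF document.
fake_pdf_bytes = b"%PDF-1.4\n"

path = os.path.join(tempfile.mkdtemp(), "example.pdf")

# 'wb' = write-binary: the mode for an output file handed to a writer.
with open(path, "wb") as out_file:
    out_file.write(fake_pdf_bytes)

# 'rb' = read-binary: the mode for a file handed to a reader.
with open(path, "rb") as in_file:
    header = in_file.read(5)

# In binary mode, read() returns bytes rather than str.
print(type(header).__name__, header)
```

Binary mode matters because PDF content is not plain text; opening it in text mode would try to decode the bytes and could corrupt or reject them.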

Q.2. From a PdfFileReader object, how do you get a Page object for page 5?

A.2. To retrieve a Page object for page 5 from a PdfFileReader object in Python:

1. Create a PdfFileReader object by opening the PDF file you want to read:

import PyPDF2
#### Open the PDF file in read-binary mode (rb)
pdf_file = open('your_pdf_file.pdf', 'rb')

pdf_reader = PyPDF2.PdfFileReader(pdf_file)

2. Get the Page object for page 5 (remember that page numbering starts from 0):

page_number = 4  # Page 5 corresponds to index 4

page_object = pdf_reader.getPage(page_number)

Now page_object represents page 5 of our PDF file. We can extract text, manipulate the page, or perform other operations as needed.

Q.3. What PdfFileReader variable stores the number of pages in the PDF document?

A.3. The numPages variable (attribute) of a PdfFileReader object stores the total number of pages in the PDF document.

import PyPDF2

#### Open the PDF file in read-binary mode (rb)

pdf_file = open('your_pdf_file.pdf', 'rb')

pdf_reader = PyPDF2.PdfFileReader(pdf_file)

#### Get the total number of pages

total_pages = pdf_reader.numPages

print(f"Total Pages: {total_pages}")
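
Because getPage() counts from 0 while people count pages from 1, it can help to check a page number against the page count before asking for the page. Here is a small sketch of that check; the helper name page_index is hypothetical, and total_pages stands in for a value read from numPages.

```python
def page_index(page_number, total_pages):
    """Convert a 1-based page number to a 0-based index, checking its range."""
    if not 1 <= page_number <= total_pages:
        raise ValueError(f"page {page_number} is outside 1..{total_pages}")
    return page_number - 1


# total_pages stands in for the value read from pdf_reader.numPages.
total_pages = 19
print(page_index(5, total_pages))
```

The returned index is what would be passed to getPage().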


Q.4. If a PdfFileReader object’s PDF is encrypted with the password swordfish, what must you do
before you can obtain Page objects from it?

A.4. To obtain Page objects from a PdfFileReader object when the PDF is encrypted with the password “swordfish”, call decrypt('swordfish') on the reader before calling getPage():
1. Open the encrypted PDF file:

import PyPDF2

#### Open the encrypted PDF file in read-binary mode (rb)

pdf_file = open('encrypted_file.pdf', 'rb')

pdf_reader = PyPDF2.PdfFileReader(pdf_file)


2. Decrypt the PDF:

password = 'swordfish'

if pdf_reader.isEncrypted:

    pdf_reader.decrypt(password)


3. Access Page Objects:

page_number = 4  # Page 5 corresponds to index 4

page_object = pdf_reader.getPage(page_number)
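
One detail the steps above do not check is whether the password was accepted. In PyPDF2 versions that provide PdfFileReader, decrypt() signals a wrong password by returning 0 instead of raising an error. The sketch below shows one way to guard against that; the helper unlock and the class FakeEncryptedReader are hypothetical, and the fake class exists only so the example runs without an encrypted PDF on hand.

```python
class FakeEncryptedReader:
    """Hypothetical stand-in for a PdfFileReader over an encrypted PDF."""

    isEncrypted = True

    def __init__(self, real_password):
        self._real_password = real_password

    def decrypt(self, password):
        # Follows the convention described above: 0 on failure, non-zero on success.
        return 1 if password == self._real_password else 0


def unlock(reader, password):
    """Decrypt the reader if needed; raise if the password is rejected."""
    if reader.isEncrypted and reader.decrypt(password) == 0:
        raise ValueError("wrong password, pages cannot be read")
    return reader


reader = unlock(FakeEncryptedReader("swordfish"), "swordfish")
print("decrypted OK")
```

With a real PdfFileReader, the same unlock() call would go between creating the reader and the first getPage() call.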


Q.5. What methods do you use to rotate a page?

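A.5. Page objects have two methods for rotating a page: rotateClockwise() and rotateCounterClockwise(). Pass the number of degrees to rotate as an integer argument; it must be a multiple of 90.

- Using PyPDF2 Library:
Get a Page object with getPage(), call rotateClockwise() or rotateCounterClockwise() on it, add the page to a PdfFileWriter object with addPage(), and save the result with write().

- Using pdf2image Library:
If you need a rotated image of a page rather than a rotated PDF, you can use the pdf2image library to convert the page to an image and then rotate it using standard image manipulation techniques.
First, install the library using pip install pdf2image.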
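
As a rough sketch of how the PyPDF2 Page methods rotateClockwise() and rotateCounterClockwise() might be wrapped, the helper below rotates a list of pages and rejects angles that are not multiples of 90. The names rotate_pages and FakePage are hypothetical; FakePage only records the requested rotation so the example runs without a PDF file, and real Page objects from getPage() would take its place.

```python
class FakePage:
    """Hypothetical stand-in for a PyPDF2 Page object; it only records rotation."""

    def __init__(self):
        self.rotation = 0

    def rotateClockwise(self, angle):
        self.rotation += angle
        return self

    def rotateCounterClockwise(self, angle):
        self.rotation -= angle
        return self


def rotate_pages(pages, degrees):
    """Rotate every page; positive degrees turn clockwise, negative the other way."""
    if degrees % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    for page in pages:
        if degrees >= 0:
            page.rotateClockwise(degrees)
        else:
            page.rotateCounterClockwise(-degrees)
    return pages


pages = rotate_pages([FakePage(), FakePage()], 90)
print([page.rotation for page in pages])
```

After rotating real pages, they still need to be added to a PdfFileWriter and written to a file opened in 'wb' mode for the change to be saved.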

Q.6. What is the difference between a Run object and a Paragraph object?

A.6. 
1. Run Object:
- A Run object represents a contiguous run of text within a paragraph.
- It is a formatting element and is used to define the formatting attributes of a specific range of text within a paragraph.
- Attributes that can be controlled by a Run object include:
- Font style (e.g., bold, italic)
- Font size
- Font color
- Underline
- Superscript or subscript
- Runs are often used to apply character-level formatting to text within a paragraph.
- For example, if you want to make a specific word within a paragraph bold, you would create a separate Run object for that word with the desired formatting.
- A paragraph can hold several runs one after another (runs are not nested inside each other), so different parts of the same paragraph can carry different formatting.
2. Paragraph Object:
- A Paragraph object represents a single paragraph of text.
- It contains one or more Run objects (which represent the formatted text within the paragraph).
- Paragraphs are used to organize text into logical units within a document.
- Attributes that can be controlled by a Paragraph object include:
- Alignment (left, center, right, justified)
- Line spacing
- Indentation
- Bulleted or numbered lists
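
The containment relationship described above can be modeled in a few lines of plain Python. This is a simplified sketch, not python-docx's own classes: the Run and Paragraph dataclasses and their fields are assumptions made for illustration, keeping only the idea that a paragraph holds runs and that its text is the runs' text joined together.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Run:
    """Character-level unit: a stretch of text with one set of formatting."""
    text: str
    bold: Optional[bool] = None
    italic: Optional[bool] = None


@dataclass
class Paragraph:
    """Block-level unit: holds runs and paragraph-wide settings."""
    runs: List[Run] = field(default_factory=list)
    alignment: str = "left"

    def add_run(self, text, **formatting):
        run = Run(text, **formatting)
        self.runs.append(run)
        return run

    @property
    def text(self):
        # The paragraph's text is its runs' text joined in order.
        return "".join(run.text for run in self.runs)


para = Paragraph()
para.add_run("A plain paragraph with some ")
para.add_run("bold", bold=True)
para.add_run(" and some ")
para.add_run("italic", italic=True)
para.add_run(" text.")
print(len(para.runs), repr(para.text))
```

Each change of formatting starts a new run, which is why a single sentence with one bold word and one italic word ends up as five runs here.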

Q.7. How do you obtain a list of Paragraph objects for a Document object that’s stored in a variable
named doc?

A.7. To obtain a list of Paragraph objects from a Document object stored in a variable named doc, use doc.paragraphs. The full steps are:
1. First, ensure that you have the python-docx library installed. If not, you can install it using:

pip install python-docx

2. Next, create a Document object (representing a Word document) and manipulate it as needed. You can add paragraphs, styles, and other content to this document.
- To retrieve a list of Paragraph objects from the doc, you can access the paragraphs attribute of the Document object. This attribute contains all the paragraphs in the document, in document order.

import docx

#### Create a new Document object

doc = docx.Document()

#### Add some paragraphs to the document

doc.add_paragraph("This is the first paragraph.")

doc.add_paragraph("And here's the second paragraph.")

doc.add_paragraph("A third paragraph for demonstration.")

#### Retrieve the list of Paragraph objects

all_paragraphs = doc.paragraphs

#### Print the text content of each paragraph

for paragraph in all_paragraphs:

    print(paragraph.text)


Q.8. What type of object has bold, underline, italic, strike, and outline variables?

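A.8. A Run object. The bold, underline, italic, strike, and outline variables belong to Run objects and control the formatting of the run's text.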

Q.9. What is the difference between False, True, and None for the bold variable?

A.9. For a Run object's bold variable (and the other text attributes), the three values mean:
1. False:
- The text in the run is never bold, regardless of the style applied to it.
- Example:

run.bold = False  # Bold is always off for this run

2. True:
- The text in the run is always bold, regardless of the style applied to it.
- Example:

run.bold = True  # Bold is always on for this run

3. None:
- Represents the absence of an explicit setting and is the default value.
- The run forces bold neither on nor off; it uses whatever bold setting the style applied to it defines.
- Example:

run.bold = None  # Bold follows the run's style
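
For the bold variable of a Run object in python-docx, the three values act as a tri-state setting: True and False override the style, while None defers to it. The sketch below illustrates that rule in plain Python; the function effective_bold and its style_bold parameter are hypothetical names, not part of python-docx.

```python
def effective_bold(run_bold, style_bold):
    """Resolve a run's tri-state bold setting against its style's setting."""
    if run_bold is None:
        # No explicit setting on the run: fall back to the style.
        return style_bold
    # True or False on the run overrides the style.
    return run_bold


for run_bold in (True, False, None):
    print(run_bold, "->", effective_bold(run_bold, style_bold=True))
```

The check uses `is None` rather than truthiness, because False and None are both falsy but mean different things here.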

Q.10. How do you create a Document object for a new Word document?

A.10. 
1. Install the python-docx Library:
- If you haven’t already, install the library using pip:

pip install python-docx

2. Create a New Document Object:
- After installation, import the docx module (not python-docx).
- Call docx.Document() with no arguments to create a new, blank Word document:

import docx

#### Create a new Document object

doc = docx.Document()

#### Add content to the document (e.g., paragraphs, headings, images)

doc.add_heading('Heading for the document', level=0)

doc_para = doc.add_paragraph('Your paragraph goes here, ')

doc_para.add_run('hey there, bold here').bold = True

doc_para.add_run(', and ')

doc_para.add_run('these words are italic').italic = True

doc.add_page_break()

doc.add_heading('Heading level 2', level=2)

doc.add_picture('path_to_picture')  # Add an image if needed

#### Save the document to a file

doc.save('path_to_document.docx')


Replace 'path_to_picture' and 'path_to_document.docx' with actual file paths.

This example creates a simple Word document with headings, paragraphs, bold and italic text, and an image.

3. Access Existing Document:
- To open an existing Word document, create an instance of the Document class by passing the path to the document file:

from docx import Document

#### Open an existing Word document

existing_doc = Document('path_to_existing_document.docx')

#### Access paragraphs, runs, and other content within the document

for para in existing_doc.paragraphs:

    print(para.text)


Q.11. How do you add a paragraph with the text 'Hello, there!' to a Document object stored in a
variable named doc?

A.11. To add a paragraph with the text “Hello, there!” to a Document object stored in a variable named doc, call doc.add_paragraph('Hello, there!'). The method returns the new Paragraph object.

- Create a new Document object and add a paragraph with the desired text:

import docx

#### Create a new Document object

doc = docx.Document()

#### Add a paragraph with the text "Hello, there!"

paragraph = doc.add_paragraph("Hello, there!")

#### Save the document to a file (optional)

doc.save("my_document.docx")



Q.12. What integers represent the levels of headings available in Word documents?

A.12. When working with Word documents using the python-docx library, the integers representing the levels of headings (also known as heading levels) are in the range from 0 to 9. These levels determine the hierarchy and formatting of headings within the document:

- Level 0: The text is formatted with the Title style, as the title of the document.
- Levels 1 to 9: These use the Heading 1 through Heading 9 styles. Level 1 is the main heading, and higher numbers are subheadings, typically shown in smaller type.
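
The mapping from heading level to style can be sketched as a small function. This is an illustration of the rule described above, not python-docx's own code; the name heading_style is hypothetical.

```python
def heading_style(level):
    """Return the style name a heading level corresponds to."""
    if not 0 <= level <= 9:
        raise ValueError(f"level must be in range 0-9, got {level}")
    return "Title" if level == 0 else f"Heading {level}"


print(heading_style(0))
print(heading_style(2))
```

A level outside 0 to 9 is rejected rather than clamped, so a typo in the level shows up immediately instead of silently producing the wrong heading.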