# Python DOCX Library

## Opening a document

[Source](https://python-docx.readthedocs.io/en/latest/user/quickstart.html)

First thing you’ll need is a document to work on. The easiest way is this:

In [1]:
from docx import Document

document = Document()

This opens up a blank document based on the default “template”, pretty much what you get when you start a new document in Word using the built-in defaults. You can open and work on an existing Word document using python-docx, but we’ll keep things simple for the moment.

### Adding a paragraph

Paragraphs are fundamental in Word. They’re used for body text, but also for headings and list items like bullets.

Here’s the simplest way to add one:

In [2]:
paragraph = document.add_paragraph('Lorem ipsum dolor sit amet.')

This method returns a reference to the newly added paragraph, which is placed at the end of the document.

It’s also possible to use one paragraph as a “cursor” and insert a new paragraph directly above it:

In [5]:
prior_paragraph = paragraph.insert_paragraph_before('Lorem ipsum')

This allows a paragraph to be inserted in the middle of a document, something that’s often important when modifying an existing document rather than generating one from scratch.

### Adding a heading

In anything but the shortest document, body text is divided into sections, each of which starts with a heading. Here’s how to add one:

In [6]:
document.add_heading('The REAL meaning of the universe')

<docx.text.paragraph.Paragraph at 0x7fd36cc68dc0>

By default, this adds a top-level heading, what appears in Word as ‘Heading 1’. When you want a heading for a sub-section, just specify the level you want as an integer between 1 and 9:

In [7]:
document.add_heading('The role of dolphins', level=2)

<docx.text.paragraph.Paragraph at 0x7fd36d2dd5b0>

If you specify a level of 0, a “Title” paragraph is added. This can be handy to start a relatively short document that doesn’t have a separate title page.

### Adding a page break

Every once in a while you want the text that comes next to go on a separate page, even if the one you’re on isn’t full. A “hard” page break gets this done:

In [8]:
document.add_page_break()

<docx.text.paragraph.Paragraph at 0x7fd36d2dd790>

### Adding a table

One frequently encounters content that lends itself to tabular presentation, lined up in neat rows and columns. Word does a pretty good job at this. Here’s how to add a table:

In [9]:
table = document.add_table(rows=2, cols=2)

**Tables have several properties and methods you’ll need in order to populate them. Accessing individual cells is probably a good place to start. As a baseline, you can always access a cell by its row and column indices:**

In [10]:
cell = table.cell(0, 1)

**This gives you the right-hand cell in the top row of the table we just created. Note that row and column indices are zero-based, just like in list access.**

**Once you have a cell, you can put something in it:**

In [11]:
cell.text = 'parrot, possibly dead'

**Frequently it’s easier to access a row of cells at a time, for example when populating a table of variable length from a data source. The .rows property of a table provides access to individual rows, each of which has a .cells property. The .cells property on both Row and Column supports indexed access, like a list:**

In [12]:
row = table.rows[1]
row.cells[0].text = 'Foo bar to you.'
row.cells[1].text = 'And a hearty foo bar to you too sir!'

**The .rows and .columns collections on a table are iterable, so you can use them directly in a for loop. Same with the .cells sequences on a row or column:**

In [13]:
for row in table.rows:
    for cell in row.cells:
        print(cell.text)


parrot, possibly dead
Foo bar to you.
And a hearty foo bar to you too sir!


If you want a count of the rows or columns in the table, just use len() on the sequence:

In [14]:
row_count = len(table.rows)
col_count = len(table.columns)

You can also add rows to a table incrementally like so:

In [15]:
row = table.add_row()

This can be very handy for the variable length table scenario we mentioned above:

In [20]:
# get table data -------------
records = (
    (3, '101', 'Spam'),
    (7, '422', 'Eggs'),
    (4, '631', 'Spam, spam, eggs, and spam')
)

table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
for qty, item_id, desc in records:  # item_id avoids shadowing the built-in id()
    row_cells = table.add_row().cells
    row_cells[0].text = str(qty)
    row_cells[1].text = item_id
    row_cells[2].text = desc

To add a table style or a picture, apply a paragraph or character style, or apply bold and italic, see the rest of the [source](https://python-docx.readthedocs.io/en/latest/user/quickstart.html).

### An example

[Source](https://python-docx.readthedocs.io/en/latest/index.html#)


In [22]:
from docx import Document
from docx.shared import Inches

document = Document()

document.add_heading('Document Title', 0)

p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True

document.add_heading('Heading, level 1', level=1)
document.add_paragraph('Intense quote', style='Intense Quote')

document.add_paragraph(
    'first item in unordered list', style='List Bullet'
)
document.add_paragraph(
    'first item in ordered list', style='List Number'
)

document.add_picture('1676405834584.jpeg', width=Inches(1.25))

records = (
    (3, '101', 'Spam'),
    (7, '422', 'Eggs'),
    (4, '631', 'Spam, spam, eggs, and spam')
)

table = document.add_table(rows=1, cols=3)
hdr_cells = table.rows[0].cells
hdr_cells[0].text = 'Qty'
hdr_cells[1].text = 'Id'
hdr_cells[2].text = 'Desc'
for qty, item_id, desc in records:  # item_id avoids shadowing the built-in id()
    row_cells = table.add_row().cells
    row_cells[0].text = str(qty)
    row_cells[1].text = item_id
    row_cells[2].text = desc

document.add_page_break()

# document.save('demo.docx')

## Working with Documents

[SOURCE](https://python-docx.readthedocs.io/en/latest/user/documents.html)

python-docx allows you to create new documents as well as make changes to existing ones. Actually, it only lets you make changes to existing documents; it’s just that if you start with a document that doesn’t have any content, it might feel at first like you’re creating one from scratch.

This characteristic is a powerful one. A lot of how a document looks is determined by the parts that are left when you delete all the content. Things like styles and page headers and footers are contained separately from the main content, allowing you to place a good deal of customization in your starting document that then appears in the document you produce.

Let’s walk through the steps to create a document one example at a time, starting with two of the main things you can do with a document, open it and save it.

### Opening a document

The simplest way to get started is to open a new document without specifying a file to open:

In [23]:
from docx import Document

document = Document()
document.save('test.docx')

This creates a new document from the built-in default template and saves it unchanged to a file named ‘test.docx’. The so-called “default template” is actually just a Word file having no content, stored with the installed python-docx package. It’s roughly the same as you get by picking the Word Document template after selecting Word’s File > New from Template… menu item.

### REALLY opening a document

If you want more control over the final document, or if you want to change an existing document, you need to open one with a filename:

In [24]:
document = Document('NLP.docx')

In [25]:
document

<docx.document.Document at 0x7fd36e501d40>

In [26]:
# save it as a different file
document.save('new-file-name.docx')

Things to note:

- You can open any Word 2007 or later file this way (.doc files from Word 2003 and earlier won’t work). While you might not be able to manipulate all the contents yet, whatever is already in there will load and save just fine. The feature set is still being built out, so some things, footnotes for example, can’t be added or changed yet, but if the document has them python-docx is polite enough to leave them alone and smart enough to save them without actually understanding what they are.
- If you use the same filename to open and save the file, python-docx will obediently overwrite the original file without a peep. You’ll want to make sure that’s what you intend.

### Opening a ‘file-like’ document

python-docx can open a document from a so-called file-like object. It can also save to a file-like object. This can be handy when you want to get the source or target document over a network connection or from a database and don’t want to (or aren’t allowed to) interact with the file system. In practice this means you can pass an open file or a BytesIO stream object, instead of a filename, to open or save a document. (On Python 3, StringIO holds text rather than bytes, so it is not suitable here.)

When you pass an open file, open it with mode 'rb'. The default mode, 'r', is enough sometimes, but the ‘b’ (selecting binary mode) is required on Windows and at least some versions of Linux to allow Zipfile to open the file, so it is safest to always include it.

## Working with text

[For paragraph properties, indentation, bold, italic, margins, paragraph and line spacing and others](https://python-docx.readthedocs.io/en/latest/user/text.html)

[Working with Sections](https://python-docx.readthedocs.io/en/latest/user/sections.html)

[Working with Headers and Footers](https://python-docx.readthedocs.io/en/latest/user/hdrftr.html)

[Understanding Styles](https://python-docx.readthedocs.io/en/latest/user/styles-understanding.html)

[Working with styles](https://python-docx.readthedocs.io/en/latest/user/styles-using.html)

[Official python-docx documentation](https://python-docx.readthedocs.io/en/latest/index.html#)