# MS Word using docx
I found the python-docx module on [anaconda.org](https://anaconda.org/cjs14/python-docx) and installed it like this:

```
conda install -c https://conda.anaconda.org/cjs14 python-docx
```

In [1]:
import docx

In [2]:
doc = docx.Document('data/demo.docx')
len(doc.paragraphs)

7

In [3]:
doc.paragraphs[0].text

u'Document Title'

In [4]:
doc.paragraphs[1].text

u'A plain paragraph with some bold and some italic'

In [5]:
len(doc.paragraphs[1].runs)

5

In [6]:
doc.paragraphs[1].runs[0].text

'A plain paragraph with'

In [7]:
doc.paragraphs[1].runs[1].text

' some '

In [8]:
doc.paragraphs[1].runs[2].text

'bold'

In [10]:
doc.paragraphs[1].runs[3].text

' and some '

In [11]:
doc.paragraphs[1].runs[4].text

'italic'

## Getting the Full Text from a .docx File
If you care only about the text in the Word document, not its styling information, you can write a small helper such as the `getText()` function below, which joins the text of every paragraph with newlines.

In [12]:
def getText(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

In [14]:
print(getText('data/demo.docx'))

Document Title
A plain paragraph with some bold and some italic
Heading, level 1
Intense quote
first item in unordered list
first item in ordered list




## Styles

In [15]:
doc.paragraphs[0].text

u'Document Title'

In [16]:
doc.paragraphs[0].style

_ParagraphStyle('Title') id: 4370634704

In [18]:
doc.paragraphs[0].style = 'Normal'
doc.paragraphs[1].text

u'A plain paragraph with some bold and some italic'

In [19]:
(doc.paragraphs[1].runs[0].text, doc.paragraphs[1].runs[1].text,
 doc.paragraphs[1].runs[2].text, doc.paragraphs[1].runs[3].text)

('A plain paragraph with', ' some ', 'bold', ' and some ')

In [24]:
# 'QuoteChar' is a style ID; newer python-docx releases prefer the style name 'Quote Char'
doc.paragraphs[1].runs[0].style = 'QuoteChar'
doc.paragraphs[1].runs[1].underline = True
doc.paragraphs[1].runs[3].underline = True
doc.save('data/restyled.docx')

## Writing Word Documents

In [25]:
import docx
doc = docx.Document()
doc.add_paragraph('Hello world!')
doc.save('data/helloworld.docx')

In [26]:
import docx
doc = docx.Document()
doc.add_paragraph('Hello world!')
paraObj1 = doc.add_paragraph('This is a second paragraph.')
paraObj2 = doc.add_paragraph('This is yet another paragraph.')
paraObj1.add_run(' This text is being added to the second paragraph.')
doc.save('data/multipleParagraphs.docx')

## Adding Headings

In [27]:
doc = docx.Document()
doc.add_heading('Header 0', 0)
doc.add_heading('Header 1', 1)
doc.add_heading('Header 2', 2)
doc.add_heading('Header 3', 3)
doc.add_heading('Header 4', 4)
doc.save('data/headings.docx')

## Inserting Graphics

In [29]:
doc.add_picture('images/amazon.png', width=docx.shared.Inches(1),
height=docx.shared.Cm(4))

<docx.shape.InlineShape at 0x103ebdf10>