# Python - Jupyter Playground
This notebook collects basic Python snippets that are useful for general-purpose work.

## f-strings
Format specifiers inside the braces control padding, alignment, fill characters, and date formatting.

In [1]:
from datetime import datetime
library = [('Author', 'Topic', 'Pages'), ('Twain', 'Rafting', 601), ('Feynman', 'Physics', 95), ('Hamilton', 'Mythology', 144)]
date = datetime(year=2019, month=8, day=8)

for author, topic, pages in library:
    print(f"{author:{10}}{topic:{30}}{pages:->10}")
    
print(f"\n{date:%A, %B %Y}")

Author    Topic                         -----Pages
Twain     Rafting                       -------601
Feynman   Physics                       --------95
Hamilton  Mythology                     -------144

Thursday, August 2019
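
The same format-spec mini-language also covers explicit alignment, widths taken from a variable, and number formatting. The sketch below is illustrative only: the `rows` data and the `width` variable are made up for the example and are not part of the cell above.

```python
from datetime import datetime

# Illustrative data, not taken from the notebook's `library` list.
rows = [("Twain", "Rafting", 601), ("Feynman", "Physics", 95)]
width = 10  # a nested field such as {width} lets the padding come from a variable

# '<' left-aligns, '>' right-aligns; a character placed before the sign is the fill.
lines = [f"{author:<{width}}{topic:>8}{pages:->6}" for author, topic, pages in rows]
for line in lines:
    print(line)

# Numbers: thousands separator with fixed decimals, and percentages.
amount = f"{1234567.891:,.2f}"
ratio = f"{0.256:.1%}"

# Dates accept strftime codes directly in the format spec.
stamp = f"{datetime(2019, 8, 8):%Y-%m-%d}"
print(amount, ratio, stamp)
```

Nested braces are only needed when the width comes from a variable; a literal width can be written directly, as in `{author:10}`.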


## File manipulation within Jupyter

In [2]:
%%writefile test.txt
Here I can write anything I want and the content will be written into the file specified in the file above.
This is a cool magic.

Overwriting test.txt


In [3]:
! ls ./

 new_pdf.pdf  'Python - playground.ipynb'   test.txt


In [4]:
! cat ./test.txt

Here I can write anything I want and the content will be written into the file specified in the file above.
This is a cool magic.
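
The `%%writefile` magic and the `!` shell commands only work inside IPython or Jupyter. For code that also has to run as a plain script, a rough equivalent can be sketched with `pathlib`. The temporary directory and the file contents below are assumptions made so the example does not touch the notebook's own `test.txt`.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so nothing in the notebook folder is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test.txt"

    # Rough counterpart of `%%writefile test.txt`: create or overwrite the file.
    path.write_text("first line\nsecond line\n", encoding="utf-8")

    # Rough counterpart of `! ls ./`: list the names in the directory.
    names = sorted(p.name for p in Path(tmp).iterdir())

    # Rough counterpart of `! cat ./test.txt`: read the whole file back.
    content = path.read_text(encoding="utf-8")

print(names)
print(content)
```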


The current working directory can be checked from within Jupyter with the `pwd` magic:

In [5]:
pwd

'/home/ohtar10-kudu/git/nlp-python-training/notebooks'
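
Outside IPython, the working directory is available from the standard library. This is a small sketch; the `test.txt` name is reused from the cells above only to show path joining.

```python
import os
from pathlib import Path

cwd_from_os = os.getcwd()      # plain string
cwd_from_pathlib = Path.cwd()  # Path object, convenient for joining paths

print(cwd_from_os)
print(cwd_from_pathlib / "test.txt")  # builds a path; does not create the file
```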

When invoking a method, pressing `Shift+Tab` shows its help within Jupyter.
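
Similar information can be retrieved programmatically, which helps when the keyboard shortcut is not available (for example in a plain Python shell). The sketch below uses `json.dumps` purely as an example target; any function could be inspected the same way.

```python
import inspect
import json

# The call signature, similar to what the tooltip displays.
signature = inspect.signature(json.dumps)

# The first line of the docstring as a short summary.
summary = inspect.getdoc(json.dumps).splitlines()[0]

print(signature)
print(summary)

# help(json.dumps) prints the full documentation.
# In IPython or Jupyter, `json.dumps?` shows it in a pager.
```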

In [6]:
with open('test.txt', 'r') as file:
    print(file.readlines())

['Here I can write anything I want and the content will be written into the file specified in the file above.\n', 'This is a cool magic.\n']
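
As the output shows, `readlines()` keeps the trailing `\n` on every line. A common alternative is to iterate over the file object and strip the newline from each line. The sketch below writes its own temporary file with made-up contents so that it can run on its own.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test.txt"
    path.write_text("line one\nline two\n", encoding="utf-8")

    # readlines() returns every line with its newline character intact.
    with open(path, "r", encoding="utf-8") as file:
        raw = file.readlines()

    # Iterating over the file reads line by line; rstrip drops the newline.
    with open(path, "r", encoding="utf-8") as file:
        stripped = [line.rstrip("\n") for line in file]

print(raw)
print(stripped)
```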


## PDF manipulation
Using PyPDF2

In [11]:
! ls ../UPDATED_NLP_COURSE/00-Python-Text-Basics

00-Working-with-Text-Files.ipynb		  Business_Proposal.pdf
01-Working-with-PDF-Text.ipynb			  Some_New_Doc.pdf
02-Regular-Expressions.ipynb			  test.txt
03-Python-Text-Basics-Assessment.ipynb		  US_Declaration.pdf
04-Python-Text-Basics-Assessment-Solutions.ipynb


In [12]:
! pip install PyPDF2

7.4.0
Collecting PyPDF2
  Downloading https://files.pythonhosted.org/packages/b4/01/68fcc0d43daf4c6bdbc6b33cc3f77bda531c86b174cac56ef0ffdb96faab/PyPDF2-1.26.0.tar.gz (77kB)
Building wheels for collected packages: PyPDF2
  Running setup.py bdist_wheel for PyPDF2 ... done
  Stored in directory: /home/ohtar10-kudu/.cache/pip/wheels/53/84/19/35bc977c8bf5f0c23a8a011aa958acd4da4bbd7a229315c1b7
Successfully built PyPDF2
Installing collected packages: PyPDF2
Successfully installed PyPDF2-1.26.0


In [13]:
import PyPDF2

pdf_path = '../UPDATED_NLP_COURSE/00-Python-Text-Basics/US_Declaration.pdf'
with open(pdf_path, 'rb') as pdf_file:  # 'rb' opens the file for reading in binary mode
    # create a PDF reader object from the open file
    pdf_reader = PyPDF2.PdfFileReader(pdf_file)
    print(f"Number of pages in pdf file {pdf_reader.numPages}")
    print(f"First lines of the first page: \n{pdf_reader.getPage(0).extractText()[:100]}...")

Number of pages in pdf file 5
First lines of the first page: 
Declaration of IndependenceIN CONGRESS, July 4, 1776. The unanimous Declaration of the thirteen unit...
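
Text extracted from a PDF often carries line breaks that follow the page layout rather than the sentences. One simple cleanup, sketched below, collapses every run of whitespace into a single space. The `normalize_whitespace` helper and the `sample` string are hypothetical and only serve the illustration; this step cannot restore spaces that the extraction dropped.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical sample that mimics layout-driven line breaks.
sample = "earth, the separate and equal station\n\nthem, a decent respect \nto the opinions"
cleaned = normalize_whitespace(sample)
print(cleaned)
```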


Copying the first page of an existing PDF into a new PDF file

In [14]:
with open(pdf_path, 'rb') as pdf_file:
    pdf_reader = PyPDF2.PdfFileReader(pdf_file)
    pdf_writer = PyPDF2.PdfFileWriter()
    pdf_writer.addPage(pdf_reader.getPage(0))
    with open('new_pdf.pdf', 'wb') as new_pdf:
        pdf_writer.write(new_pdf)

In [15]:
! ls

 new_pdf.pdf  'Python - playground.ipynb'   test.txt


Reading the text from every page of the PDF

In [16]:
with open(pdf_path, 'rb') as pdf_file:
    pdf_reader = PyPDF2.PdfFileReader(pdf_file)
    text = []
    for p in range(pdf_reader.numPages):
        text.append(pdf_reader.getPage(p).extractText())
    print(f"First page:\n{text[0]}")

First page:
Declaration of IndependenceIN CONGRESS, July 4, 1776. The unanimous Declaration of the thirteen united States of America, When in the Course of human events, it becomes necessary for one people to dissolve the
political bands which have connected them with another, and to assume among the powers of the
earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle

them, a decent respect to the opinions of mankind requires that they should declare the causes

which impel them to the separation. 
We hold these truths to be self-evident, that all men are created equal, that they are endowed by

their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit
of Happiness.ŠThat to secure these rights, Governments are instituted among Men, deriving

their just powers from the consent of the governed,ŠThat whenever any Form of Government
becomes destructive of these ends, it is the Right of the People to alter or to