# How to Extract PDF Tables in Python

In this tutorial, you will learn how to extract tables from PDF files using the Camelot library in Python. Camelot is a Python library and command-line tool that makes it easy to extract data tables trapped inside PDF files.

### Installation of Camelot
This section covers the steps to install Camelot.

#### Using conda
The easiest way to install Camelot is with conda, a package manager and environment management system for the Anaconda distribution.

`$ conda install -c conda-forge camelot-py`


#### Using pip
After installing the dependencies, which include Tkinter and Ghostscript, you can use pip to install Camelot:

`$ pip install camelot-py[cv]`


For more information, see the [official documentation](https://camelot-py.readthedocs.io/en/master/user/install.html#install).
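
To confirm that the installation worked, one option is to check whether Python can locate the `camelot` module before running the rest of the tutorial. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name introduced here, not part of Camelot.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("camelot"):
    import camelot

    print("Camelot version:", camelot.__version__)
else:
    print("Camelot is not installed in this environment")
```

If the second message appears, rerun one of the install commands above in the same environment that runs your script or notebook.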

The PDF used in this tutorial can be downloaded [here](https://github.com/makozi/-Extracting-PDF-Tables-in-Python/files/3870463/foo.pdf).


    import camelot

    file = "foo.pdf"
    tables = camelot.read_pdf(file)

    print("Total tables extracted:", tables.n)

Output:

    Total tables extracted: 1
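
With no extra arguments, `read_pdf` processes only the first page and uses the `lattice` flavor, which expects tables drawn with ruled lines. For other documents you may need the `pages` argument (a comma-separated string such as `"1,3,4"` or `"all"`) or `flavor="stream"` for tables separated by whitespace. The following is a minimal sketch under those assumptions; `pages_spec` and `extract_tables` are hypothetical helper names, not Camelot functions.

```python
def pages_spec(page_numbers):
    """Turn a sequence like [1, 3, 4] into the string '1,3,4'."""
    return ",".join(str(number) for number in page_numbers)


def extract_tables(pdf_path, page_numbers=(1,), flavor="lattice"):
    """Hypothetical wrapper around camelot.read_pdf with basic validation."""
    if flavor not in ("lattice", "stream"):
        raise ValueError("flavor must be 'lattice' or 'stream'")
    # Imported here so the helpers can be defined even without Camelot installed.
    import camelot

    return camelot.read_pdf(pdf_path, pages=pages_spec(page_numbers), flavor=flavor)
```

For example, `extract_tables("foo.pdf", [1], flavor="stream")` would retry the same file with the other parser. After extraction, `tables[0].parsing_report` gives a small dictionary of quality metrics that can help you decide which flavor worked better.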


    # print the first table as a pandas DataFrame
    print(tables[0].df)

Output:

                  0            1                2                     3  \
    0  Cycle \nName  KI \n(1/km)  Distance \n(mi)  Percent Fuel Savings
    1                                                  Improved \nSpeed
    2        2012_2         3.30              1.3                  5.9%
    3        2145_1         0.68             11.2                  2.4%
    4        4234_1         0.59             58.7                  8.5%
    5        2032_2         0.17             57.8                 21.7%
    6        4171_1         0.07            173.9                 58.1%

                       4                  5                 6
    0
    1  Decreased \nAccel  Eliminate \nStops  Decreased \nIdle
    2               9.5%              29.2%             17.4%
    3               0.1%               9.5%              2.7%
    4               1.3%               8.5%              3.3%
    5               0.3%               2.7%              1.2%
    6
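
In the printed table, the header text is spread across rows 0 and 1, and several cells contain line breaks (shown as `\n`). One possible cleanup, sketched here with plain Python lists, is to collapse the whitespace in each cell and join the two header rows column by column. The helper names are hypothetical, and the sample rows are the first four rows and columns copied from the output above.

```python
def clean_cell(text):
    """Collapse newlines and repeated spaces inside a cell into single spaces."""
    return " ".join(text.split())


def merge_header_rows(rows, header_count=2):
    """Join the first `header_count` rows column by column into one header."""
    header = []
    for column_cells in zip(*rows[:header_count]):
        parts = [clean_cell(cell) for cell in column_cells if cell.strip()]
        header.append(" ".join(parts))
    body = [[clean_cell(cell) for cell in row] for row in rows[header_count:]]
    return header, body


# Rows 0-3 and columns 0-3 of the table printed above.
rows = [
    ["Cycle \nName", "KI \n(1/km)", "Distance \n(mi)", "Percent Fuel Savings"],
    ["", "", "", "Improved \nSpeed"],
    ["2012_2", "3.30", "1.3", "5.9%"],
    ["2145_1", "0.68", "11.2", "2.4%"],
]

header, body = merge_header_rows(rows)
print(header)
print(body)
```

With a real extraction result, `tables[0].df.values.tolist()` produces a list of rows in this shape.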

    # export the first table on its own
    tables[0].to_csv("foo.csv")

    # or export all tables in a zip archive
    tables.export("doo.csv", f="csv", compress=True)
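
To check what the compressed export produced, you can list the archive contents with the standard library. This sketch assumes the archive is written next to the given path as `doo.zip`, with one CSV file per table inside; the archive name is an assumption based on the `"doo.csv"` argument, and `list_exported_files` is a hypothetical helper.

```python
import zipfile
from pathlib import Path


def list_exported_files(archive_path):
    """Return the names of the files stored in a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


# Assumed archive name, derived from the "doo.csv" path passed to export().
archive_path = Path("doo.zip")
if archive_path.exists():
    for name in list_exported_files(archive_path):
        print(name)
else:
    print(f"{archive_path} not found; run tables.export(...) first")
```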