# Python Technologist Application Test

## Problem 1


[This is](https://ar.wikipedia.org/wiki/%D9%82%D8%A7%D8%A6%D9%85%D8%A9_%D8%A3%D9%81%D8%B6%D9%84_%D9%85%D8%A6%D8%A9_%D8%B1%D9%88%D8%A7%D9%8A%D8%A9_%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9) a Wikipedia page listing the 100 best Arabic novels according to the Arab Writers Union. If it does not open for any reason, try [this](https://www.marefa.org/%D9%82%D8%A7%D8%A6%D9%85%D8%A9_%D8%A3%D9%81%D8%B6%D9%84_%D8%A7%D9%84%D9%83%D8%AA%D8%A8_%D8%A7%D9%84%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9) alternative link for the same information.

Using Python, do the following:

1. Scrape the webpage to get the books table and write it to an Excel file, keeping all the content from the HTML table, including hyperlinks if any.
2. Write the content to a Google Sheet.



Write your code in the following cell. You are free to add as many cells as you need.

In [None]:
# The script is located in the root dir at standalone/excel_script.py.
# To test it, run it from its location or from the views of the deployed project.
# For more details: https://scrapingwiki.herokuapp.com/


import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

domain = 'https://ar.wikipedia.org'

url = "https://ar.wikipedia.org/wiki/%D9%82%D8%A7%D8%A6%D9%85%D8%A9_%D8%A3%D9%81%D8%B6%D9%84_%D9%85%D8%A6%D8%A9_%D8%B1%D9%88%D8%A7%D9%8A%D8%A9_%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9"

response = requests.get(url)
response.raise_for_status()  # fail early on an HTTP error instead of parsing an error page


soup = bs(response.content, features='html.parser')

table = soup.select('table.wikitable')[0]

columns = [i.get_text(strip=True) for i in table.find_all("th")]

columns += ["رابط الكتاب", "رابط المؤلف", "رابط البلد"]

data = []

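for tr in table.find("tbody").find_all("tr"):
    tds = tr.find_all('td')
    if not tds:
        # Skip the header row, which contains only <th> cells.
        continue

    cells = []
    links = []
    for td in tds:
        # Keep the visible text of every cell.
        cells.append(td.get_text(strip=True))
        # Keep the first hyperlink of the cell, if it has one.
        anchor = td.find('a')
        if anchor and anchor.get('href'):
            links.append(domain + anchor['href'])
    # Links are appended in cell order after the text columns.
    data.append(cells + links)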


df = pd.DataFrame(data, columns=columns)

df.to_excel("data.xlsx", index=False)


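import json

# Read the file back and print the rows as JSON to check the result.
dict_data = pd.read_excel("data.xlsx")
rec = dict_data.to_dict("index")

# ensure_ascii=False keeps the Arabic text readable; printing the str
# (rather than UTF-8 bytes) avoids a wall of \x escapes.
final = json.dumps(rec, ensure_ascii=False)
print(final)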
# with open(r'test.txt', 'w') as fp:
#     fp.write(str(rec))
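
The second task, writing the content to a Google Sheet, could be approached along these lines. This is only a minimal sketch: it assumes the third-party `gspread` package, a Google service-account key file, and an existing spreadsheet shared with that account. The names `to_sheet_rows`, `upload_rows`, `service_account.json`, and the spreadsheet title are hypothetical.

```python
import math


def to_sheet_rows(columns, records):
    """Build a header row plus one row per record, replacing missing values with ''."""
    rows = [list(columns)]
    for record in records:
        row = []
        for column in columns:
            value = record.get(column)
            # None and NaN are not valid JSON cell values, so send an empty string.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                value = ""
            row.append(value)
        rows.append(row)
    return rows


def upload_rows(rows, spreadsheet_title, key_file="service_account.json"):
    """Write rows to the first worksheet of an existing spreadsheet (assumed setup)."""
    import gspread  # third-party; imported here so to_sheet_rows works without it

    client = gspread.service_account(filename=key_file)
    worksheet = client.open(spreadsheet_title).sheet1
    worksheet.clear()       # remove any previous content
    worksheet.update(rows)  # write the rows starting at the top-left cell
```

With the DataFrame from the cell above, a call might look like `upload_rows(to_sheet_rows(df.columns, df.to_dict("records")), "Best Arabic novels")`.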

## Problem 2 

Create REST APIs in Python using Flask to read (GET) and write (POST, DELETE, PUT) the local Excel file from the previous problem. Please make sure to bundle all the API dependencies so that the API is usable. Deploying the API to Heroku would be a big plus.

Write your code in the following cell. You are free to add as many cells as you need.

In [None]:
# This is implemented with Django & DRF and deployed on Heroku (link below).
# All the details about it are on the report page, p2.
# https://scrapingwiki.herokuapp.com/
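
The task statement names Flask, while the deployed project uses Django and DRF. For comparison, the four operations could be sketched in Flask roughly as follows. The `/books` routes, the `RecordStore` helper, and the in-memory list are assumptions made for illustration; saving changes back to `data.xlsx` is left out to keep the sketch short.

```python
class RecordStore:
    """In-memory list of row dicts, addressed by position (hypothetical helper)."""

    def __init__(self, records=None):
        self.records = list(records or [])

    def all(self):
        return self.records

    def add(self, record):
        self.records.append(record)
        return len(self.records) - 1

    def replace(self, index, record):
        if not 0 <= index < len(self.records):
            raise KeyError(index)
        self.records[index] = record

    def delete(self, index):
        if not 0 <= index < len(self.records):
            raise KeyError(index)
        return self.records.pop(index)


def create_app(store):
    """Wire the store to Flask routes (Flask is a third-party dependency)."""
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)

    @app.route("/books", methods=["GET"])
    def list_books():
        return jsonify(store.all())

    @app.route("/books", methods=["POST"])
    def add_book():
        index = store.add(request.get_json())
        return jsonify({"index": index}), 201

    @app.route("/books/<int:index>", methods=["PUT"])
    def replace_book(index):
        try:
            store.replace(index, request.get_json())
        except KeyError:
            abort(404)
        return jsonify(store.all()[index])

    @app.route("/books/<int:index>", methods=["DELETE"])
    def delete_book(index):
        try:
            store.delete(index)
        except KeyError:
            abort(404)
        return "", 204

    return app
```

The rows could be loaded with `pd.read_excel("data.xlsx").to_dict("records")`, and `create_app(RecordStore(rows)).run()` would start Flask's development server.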

## Problem 3

Write a tool to create a PDF cover for the books, following the attached example "book-cover-sample.pdf". Consider the following:

    1. The QR code should embed the book hyperlink from Wikipedia.
    2. The QR code should be clickable so that users can access the link by clicking on it.
    3. Include all the covers in one directory and compress it in ZIP format.
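
For the clickable QR code, the link rectangle has to line up with where the code is drawn on the page. One way to derive it is sketched below, assuming the QR image is drawn as a square and that its quiet-zone border should not be clickable. The function name and its parameters are hypothetical.

```python
def qr_link_rect(x, y, size, modules, border):
    """Return (x1, y1, x2, y2) of the QR modules inside a square image.

    x, y    -- lower-left corner of the drawn image, in points
    size    -- side length of the drawn image, in points
    modules -- number of QR modules per side, excluding the border
    border  -- quiet-zone width, in modules
    """
    module_size = size / (modules + 2 * border)
    inset = border * module_size
    return (x + inset, y + inset, x + size - inset, y + size - inset)
```

A standard QR code of version `v` has `17 + 4 * v` modules per side, so a version 1 code has 21. The resulting tuple can be passed as the `rect` argument of ReportLab's `Canvas.linkURL`.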
    

Write your code in the following cell. You are free to add as many cells as you need.

In [None]:
# The script is located in the root dir at /stand_alone_scripts/pdf_cover_generator_tool.
# To test it, run it from its location or from the views of the deployed project.
# For more details: https://scrapingwiki.herokuapp.com/

import reportlab
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib import utils
from reportlab.lib.enums import TA_CENTER
from reportlab.pdfgen.canvas import Canvas
from reportlab.platypus import (Paragraph, Frame, Image,)
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

from bidi.algorithm import get_display
import pandas as pd
import qrcode
import arabic_reshaper
import os
import re

# Resolve paths relative to this script, even when it is started with a relative path.
base_dir = os.path.dirname(os.path.abspath(__file__))
excel_file_path = os.path.join(base_dir, "data.xlsx")


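def image_creator(book_name, book_link):
    """Generate a QR code for book_link and save it as qr_images/<book_name>.png."""
    qr = qrcode.QRCode(
        version=1,
        box_size=10,
        border=5
    )

    qr.add_data(book_link)

    # fit=True lets the library pick a larger version if the link does not fit.
    qr.make(fit=True)
    img = qr.make_image(fill_color='blue', back_color='white')

    qr_dir = '{}/qr_images/'.format(base_dir)
    os.makedirs(qr_dir, exist_ok=True)  # create the output directory on first use
    img.save(qr_dir + str(book_name) + '.png')
    return img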


reportlab.rl_config.TTFSearchPath.append('{}/fonts/'.format(base_dir))
pdfmetrics.registerFont(TTFont(
    'KFGQPC Uthman Taha Naskh Regular',
    '{}/fonts/Traditional_Arabic.ttf'.format(base_dir)))


def get_image(path, width=1):
    img = utils.ImageReader(path)
    iw, ih = img.getSize()
    aspect = ih / float(iw)
    return Image(path, width=width, height=(width * aspect))


def create_pdf(author_name, book_name, book_link):
    # Create the output directory on first use, then open a new PDF canvas for this book.
    os.makedirs('{}/books_covers'.format(base_dir), exist_ok=True)
    pdf = Canvas("{}/books_covers/{}.pdf".format(base_dir, book_name))
    image_frame = Frame(100, 300, 400, 400, showBoundary=0, leftPadding=1, rightPadding=1, bottomPadding=1,
                        topPadding=1)

    text_frame = Frame(150, 150, 300, 150, showBoundary=0, leftPadding=1, rightPadding=1, bottomPadding=1, topPadding=1)

    # Directory that holds the generated QR code images.
    directory = '{}/qr_images/'.format(base_dir)

    # Generate the QR code and save it to that directory.
    image_creator(book_name=book_name, book_link=book_link)

    code = []
    code.append(get_image(directory + book_name + '.png', width=150))

    # Add the image to the frame.
    image_frame.addFromList(code, pdf)

    # Draw the QR code at full size (400 x 400 points).
    pdf.drawInlineImage(directory + book_name + '.png', 100, 300, 400, 400)

    # Make the QR code clickable by placing a link rectangle over it.
    pdf.linkURL(book_link, rect=(145, 335, 455, 655), relative=1)

    text_list = []
    styles = getSampleStyleSheet()
    style = styles['Title']

    text_1 = arabic_reshaper.reshape(u"{}".format(book_name))
    text_1 = get_display(text_1)
    text_2 = arabic_reshaper.reshape(u"{}".format(author_name))
    text_2 = get_display(text_2)
    text = re.sub(r'\n', '<br/>', (text_1 + '\n\n' + text_2))

    text_list.append(Paragraph(text, ParagraphStyle(
        name='', fontName='KFGQPC Uthman Taha Naskh Regular', fontSize=38,
        textColor='black', alignment=TA_CENTER), encoding='utf8'))

    # * add text to the new frame
    text_frame.addFromList(text_list, pdf)

    pdf.save()

    return pdf


def generate_books_cover(excel_path):
    # Drop rows with missing values.
    data = pd.read_excel(excel_path).dropna()

    # Create one cover per remaining row. Slicing off the first key here
    # would skip the first book, because dropna() already removed empty rows.
    for record in data.to_dict("records"):
        author_name = record['المؤلف']
        book_name = record['الرواية']
        book_link = record['رابط الكتاب']
        create_pdf(author_name, book_name, book_link)


generate_books_cover(excel_file_path)
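
Requirement 3 also asks for the covers directory to be compressed in ZIP format. A minimal sketch using only the standard library, assuming the covers were written to a `books_covers` directory as in the script above; the function name `zip_covers` and the default archive name are hypothetical.

```python
import os
import shutil


def zip_covers(covers_dir, archive_name="books_covers"):
    """Compress covers_dir into <archive_name>.zip next to it and return the archive path."""
    covers_dir = os.path.abspath(covers_dir)
    parent = os.path.dirname(covers_dir)
    base_name = os.path.join(parent, archive_name)
    # root_dir/base_dir keep the directory name as the top-level entry in the archive.
    return shutil.make_archive(
        base_name, "zip", root_dir=parent, base_dir=os.path.basename(covers_dir)
    )
```

After the covers are generated, `zip_covers(os.path.join(base_dir, "books_covers"))` would produce `books_covers.zip` beside the directory.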