# Imports
Note that some of these imports are awkward to install and may require some tinkering to get working.

In [98]:
from transformers import pipeline # Handles summarization
import requests # Handles translation using the DeepL API
from typing import Optional

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\bytec\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


# Global variables and constants
The DeepL API key is used for the translation API. If the API key becomes invalid for some reason, you can generate your own API key following the instructions at https://www.deepl.com/docs-api.

In [99]:
# At the time of writing (20.05.2023), the API key still had 450k unused characters. Please keep this in mind and use the provided key responsibly.
DEEPL_API_KEY = '898523e2-0911-71ea-8d45-3e60991d2130:fx'
DEEPL_BASE_URL = 'https://api-free.deepl.com'

All model checkpoint layers were used when initializing TFBartForConditionalGeneration.

All the layers of TFBartForConditionalGeneration were initialized from the model checkpoint at facebook/bart-large-cnn.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartForConditionalGeneration for predictions without further training.
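If you generate your own key, one option is to keep it out of the notebook and read it from the environment instead. The following is a minimal sketch, not part of the original notebook: the helper names `load_deepl_api_key` and `deepl_base_url` and the environment variable name are assumptions made for illustration. It relies on the DeepL convention that free-tier keys end in `:fx` and are served from the `api-free` host.

```python
import os
from typing import Optional


def load_deepl_api_key(env_var: str = "DEEPL_API_KEY", default: Optional[str] = None) -> str:
    """Hypothetical helper: read the DeepL key from an environment variable."""
    key = os.environ.get(env_var, default)
    if not key:
        raise RuntimeError(f"No DeepL API key found in environment variable {env_var!r}")
    return key


def deepl_base_url(api_key: str) -> str:
    """Hypothetical helper: pick the host that matches the key type."""
    # Free-tier keys carry the ":fx" suffix and use the api-free host.
    if api_key.endswith(":fx"):
        return "https://api-free.deepl.com"
    return "https://api.deepl.com"
```

With this in place, `DEEPL_API_KEY = load_deepl_api_key()` and `DEEPL_BASE_URL = deepl_base_url(DEEPL_API_KEY)` could stand in for the hard-coded constants.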


# A short description of the summarization logic
Originally, the plan was to simply summarize the provided text natively in the language it was provided. There are plenty of examples of this available, such as the open-source Reddit bot "autotldr", which has a similar function.

Problems arose when it turned out that many slideshows contain only bullet points, which is incompatible with the approach other similar projects use. Those projects are extractive: they pick out the important sentences from the provided text without editing them. That approach falls apart on our input.

To bypass this problem, the `SummarizationPipeline` from the Hugging Face `transformers` library is used. By translating the source text to English and then summarizing it, we can avoid many of the issues that arise with traditional summarization methods. This also makes it trivial to add additional languages: in principle, every language DeepL supports should be summarizable by default. Keep in mind that this is untested functionality and no guarantees are provided.
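The flow described above (translate to English, summarize, translate back) can be sketched independently of any particular service. This is an illustration under assumptions, not the notebook's implementation: `summarize_via_english` and the two stand-in callables are hypothetical names, and the shortcut that skips the second translation for English input is a design choice of this sketch.

```python
from typing import Callable, Optional, Tuple

# A translator takes (text, target_lang, source_lang) and returns (detected_source_lang, translated_text).
Translator = Callable[[str, str, Optional[str]], Tuple[str, str]]
# A summarizer takes English text and returns an English summary.
Summarizer = Callable[[str], str]


def summarize_via_english(text: str, translate: Translator, summarize: Summarizer,
                          source_lang: Optional[str] = None) -> str:
    """Hypothetical sketch of the translate -> summarize -> translate-back flow."""
    # Step 1: bring the text into English and remember which language it came from.
    detected_lang, english_text = translate(text, "EN-GB", source_lang)
    # Step 2: summarize the English text.
    english_summary = summarize(english_text)
    # Step 3: translate the summary back, unless the input was already English.
    if detected_lang.upper().startswith("EN"):
        return english_summary
    _, summary = translate(english_summary, detected_lang, "EN")
    return summary


# Stand-ins for the translation service and the model, so the flow can be tried offline.
def fake_translate(text: str, target_lang: str, source_lang: Optional[str] = None) -> Tuple[str, str]:
    return (source_lang or "ET", f"[{target_lang}] {text}")


def fake_summarize(text: str) -> str:
    return text[:10]


print(summarize_via_english("tere maailm", fake_translate, fake_summarize))
```

Passing the translator and summarizer in as arguments keeps the flow testable without network access or a downloaded model.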

In [100]:
def translate_text(text: str, target_lang: str = 'EN-GB', source_lang: Optional[str] = None) -> tuple[str, str]:
    """This function returns a tuple of (source_lang, translated_text)."""
    # Build the URL for the translation service
    url = f"{DEEPL_BASE_URL}/v2/translate"
    # Build the payload
    payload = { 'text': [text], 'target_lang': target_lang }
    # In case a manual source language is set, we should pass it along. Otherwise, DeepL will handle it for us
    if source_lang is not None:
        # The key must be the literal string 'source_lang', not the value of the variable
        payload['source_lang'] = source_lang
    # Headers
    headers = { 'Authorization': f"DeepL-Auth-Key {DEEPL_API_KEY}" }
    # Send the request
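    response = requests.post(url, json=payload, headers=headers)
    # Fail loudly on HTTP errors (e.g. an invalid key or an exhausted quota) rather than on a KeyError below
    response.raise_for_status()
    json_response = response.json()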
    # See the DeepL docs for the exact JSON format
    return json_response['translations'][0]['detected_source_language'], json_response['translations'][0]['text']
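To make the request and response shapes concrete, the payload construction and the response parsing can be pulled out into two small pure functions that are easy to check without calling the API. This is a sketch: `build_translate_payload` and `parse_translation` are hypothetical helper names, and the sample response below only mirrors the fields the function above already reads (`translations`, `detected_source_language`, `text`).

```python
from typing import Any, Dict, Optional, Tuple


def build_translate_payload(text: str, target_lang: str = "EN-GB",
                            source_lang: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: build the JSON body for the /v2/translate request."""
    payload: Dict[str, Any] = {"text": [text], "target_lang": target_lang}
    # Only send source_lang when the caller knows it; otherwise DeepL detects it.
    if source_lang is not None:
        payload["source_lang"] = source_lang
    return payload


def parse_translation(json_response: Dict[str, Any]) -> Tuple[str, str]:
    """Hypothetical helper: extract (detected_source_language, text) from the response."""
    translations = json_response.get("translations") or []
    if not translations:
        raise ValueError("DeepL response contained no translations")
    first = translations[0]
    return first["detected_source_language"], first["text"]


# Illustrative response with the same fields translate_text reads.
sample_response = {"translations": [{"detected_source_language": "ET", "text": "Hello world"}]}
print(build_translate_payload("Tere maailm", source_lang="ET"))
print(parse_translation(sample_response))
```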

In [101]:
def summarize_text(text: str, language: Optional[str] = None, test=False) -> str:
    """This function returns a summary of the provided text. If the source language is known, pass it in the `language`
        argument for a more accurate translation. For testing, set `test` to True and pass in English text only."""
    # Get the translated text with its corresponding language
    source_lang = 'EN'
    translated_text = text
    if not test:
        source_lang, translated_text = translate_text(text, source_lang=language)
    # Summarize the text
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    summarized_text_en = summarizer(translated_text, max_length=1024, min_length=500, do_sample=False, truncation=True)[0]['summary_text']
    # Get back the original language
    returnable_text = summarized_text_en
    if not test:
        _, returnable_text = translate_text(summarized_text_en, target_lang=source_lang, source_lang='EN')
    return returnable_text
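Because the summarizer is called with `truncation=True`, input beyond the model's limit is cut off rather than summarized. One possible workaround, which is not part of the notebook, is to split long text into smaller pieces and summarize each piece separately. The sketch below assumes a hypothetical helper `split_into_chunks` and uses a character budget as a rough stand-in for the model's token limit; the default of 3000 characters is an arbitrary illustrative value.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 3000) -> List[str]:
    """Hypothetical helper: group whole sentences into chunks of at most max_chars characters."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # The current chunk is full: store it and start a new one with this sentence.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Note: a single sentence longer than max_chars is kept whole as its own chunk.
    return chunks


print(split_into_chunks("A b. C d. E f.", max_chars=9))
```

Each chunk could then be passed to `summarize_text` and the partial summaries joined, at the cost of one model call (and, outside test mode, two translation requests) per chunk.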


In [103]:
summarize_text("The ancient Greeks knew of five planets in addition to the sun, moon and fixed stars aävnteg dotepeg wandering stars or simply rmhavfrai rovers Mercury, Venus, Mars, Jupiter, Saturn, as well as comets koufitng long-haul stars and meteors uer wpa celestial phenomenon, high in the air. The centre of all was the earth. Mikotaj copernik all orbits take place around the sun. Galileo galilei the first telescopic observations of the moons also orbiting jupiter the phases of the moon and the surface of the sun are not ideal. The idea that there is a system of celestial bodies clustered around the sun began to emerge in the century. In the year was a march exceptionally close to Earth at about a million kilometres or an honour. The Italian astronomer Giovanni Schiaparelli thought he could see waves, troughs or river courses, which he called canals. In English this was translated as canals, more correctly channels or grooves insismgat. Tnd ne. U.S. businessman percival lowell set up his private observatory in flagstaff, arizona, to study the martian. (See: Picture 0) The solar system as we know it today. (See: Picture 1) The Sun. A star of the spectral class mass kg mg radius km ry luminosity or radiant intensity. Visible surface or photosphere temperature temperature in the core ca million. (See: Figure 2) The Sun's main constituents are the nucleus, the radiation zone, the convection zone, the atmosphere, the photosphere, the chromosphere and the corona. The Sun is a common star among at least a billion that make up the Milky Way galaxy. The Sun and the solar system move around the centre of the Milky Way galaxy at a speed of about km/. (See: Figure 3) The Sun and the solar system move around the centre of the Milky Way at a speed of about km/. 
When viewed from the Earth or the Sun's North Pole, all planets orbit the Sun in the opposite direction to the direction of the clockwise motion of the Sun, most planets also rotate on their axis in the same direction. The planets of the solar system are divided into two classes terrestrial id terra, or earth type, and giant planets, or jovian id jove, the parallel name for Jupiter. Earth-type planets are close to the sun, while the giant planets are far apart from each other and the sun, au. Earth-like planets are small, dense, made up of rocks and metals giant planets are large, low-density, mainly. Terrestrial planets have a solid surface, carbon planets do not. Terrestrial planets have weak or no magnetic field, and giant planets have strong magnetic fields. Terrestrial planets rotate relatively slowly, giant planets fast. Terrestrial planets have a total of only a few moons, whereas carbon planets each have several dozen or more. Earth-like planets are also not very similar. They all have atmospheres, but very different Mercury is practically a vacuum, Venus many times denser than Earth. Only on Earth is there free oxygen in the atmosphere and liquid water on the surface. The surface structures are very different in the Mercury, where the unchanging surface is covered by numerous craters, as is the active volcanic activity on the Moon's water. Rotation speeds on their own axes are very different, with the Earth and Mars completing a full rotation in about an hour, Mercury in a day, Venus in a day, and vice versa to orbital motion. Earth and Mars have moons, Mercury and Venus do not. Mercury id mercurius god of merchants, finance, travellers etc. in ancient Rome. Mercury. Diameter km Earth its mass kg Earth its average density cm? Distance from the sun Eccentricity of orbit. Orbital period period of rotation of days the resonance between rotation and orbit for an observer on the surface of mercury is two years in one day. Temperature at the equator poles. Min. 
Mean. Max. (See: Figure 4) Venus id venus the goddess of love and beauty in the ancient Rome. Venus. Diameter km earth its mass kg earth its average density cm?. Distance from the sun honor orbital eccentricity. Orbital period days orbital period days. Average temperature. Atmospheric pressure at surface mpa times greater than on earth in composition co, ar, co, etc. (See: Figure 5) Earth, id terra, ingl earth, sks die erde, kr. Earth. Diameter average km mass kg average density cm?. Distance from the sun au orbital eccentricity. Orbital period days orbital period hours. Temperature. Min. Average max. Atmosphere ar co, aver. Earth's structure crust, interstitial belt, outer atmosphere, inner atmosphere. The magnetic field protects the Earth from electrically charged particles from solar wind and cosmic rays. (See: Figure 6) Moon ingl moon old english mõna proto germanic mõ8 so-called moon as a period of time id luna, kr oghfjvn selõne. Diameter km earth its mass kg earth its average density cm?. (See: Fig. 7) Rotation period days rotation period days. Temperature at the equator average at the poles average. (See: Figure 8) (See: Figure 9) Mars id mars ancient Roman god of war. Mars. Diameter km Earth's own mass kg Earth's own average density cm?. Distance from the sun au mean au orbital eccentricity. Orbital period days orbital period days. Temperature min. Average. Max. Atmospheric pressure at surface mean kpa ca of earth composition co, ar co. (See: Figure 10) (See: Figure 11) Phobos and deimos fear and horror. The planets of galaxies. Low-medium density, extensive atmospheres mainly he rotating rapidly strong magnetic field many moons on all rings, but saturnil most conspicuous. Jupiter and Saturn are similar gas giants to each other; Uranus and Neptune are similar ice giants to each other. Jupiter id üpiter ancient Roman chief god, god of heaven also jove. Jupiter. 
Diameter km the earth its mass kg the earth its mass times more than all the other planets combined average density cm?. Distance from the sun au mean au orbital eccentricity. Orbital period days rotation period. Temperature earth at atmospheric pressure average earth at atmospheric pressure average. Atmospheric pressure at surface average kpa ca of earth composition he ch, nh3 hd, hg, ho. Jupiter's internal structure. Jupiter's great red spot great red spot ca km. (See: Figure 12) Galilean moons detected. (See: Picture 13) Ganymede. Callisto. Europa. Active volcanoes. Liquid water under ice. Solar system's largest moon, km. Saturn id saturn ancient Roman god of agriculture, fertility, abundance. Saturn. Diameter km Earth its mass kg Earth its average density cm? water cm?!. Distance from the sun au mean au orbital eccentricity. Orbital period period of days rotation period. Temperature earth at atmospheric pressure mean earth at atmospheric pressure mean. Atmospheric pressure at surface aver kpa composition he ch, nh3 hd, hg. Saturn's internal structure and rings. Automated satellite cassini's last photo of the saturn family, published in november after cassini had entered saturn's atmosphere in september. (See: Figure 14) Uran id uranus kr oüpavõg sky god. The first planet discovered in the telescopic era, March. William herschel. Uranus. Diameter km earth its mass kg earth its mean density cm?. Distance from the sun au mean au orbital eccentricity. Orbital period days rotation period. Mean. Temperature. Atmospheric pressure changes from ca mpa to upa composition he ch, ice nh, nh, incl. Uranium rings and moons. Neptune id neptune of the ancient Roman sea god. Discovered on Sept. (See: Figure 15) (See: Figure 16) Neptune. Diameter km Earth's own mass kg Earth's own mean density cm?. Distance from the sun au mean au orbital eccentricity. Orbital period period of days rotation period. 
Temperature earth at atmospheric pressure mean earth at atmospheric pressure mean. Atmospheric pressure varies within large limits composition he ch, zt ice nm, nh, incl. (See: Figure 17) Triton orbits the Neptune in the opposite direction to its rotation. Pluto id plütõ, kr aoürwv god of the underworld or hades in classical mythology. Clyde tombaugh discovered pluto on february at the observatory that percival lowell had founded to study the martian channels. Pluto. Diameter km mass kg ore average density cm?. Distance from the sun au mean au orbital period years days orbital eccentricity of inclination to ecliptic. Companion charon. (See: Figure 18) Other companions. (See: Figure 19) Dwarf planets. Dwarf planets. Dwarf dwarfs, orbiting around a star in the Sun, are massive enough to be in hydrostatic equilibrium under the influence of gravity, more or less spherical in shape, have not been able to clear their immediate surroundings of smaller bodies. Currently on the official list of dwarf planets. Ceres pluto haumea makemake eris. Discovered. On a dwarf planet. Mass approximately greater than that of pluto. There are probably hundreds of celestial bodies beyond Neptune's orbit that could turn out to be dwarf planets, some larger than pluto. Smaller ones of about km in size may be found above. Tno trans Neptunian objects. The glory of Kuiper's belt. Scattered disk glory. Earth objects detached objects au. Kuju moon fiona ji tcyn2 fa an eitan ki is cr, ait äike ki ge sedna na amaan distant trans neptunian objects kuiper diffuse disk detached objects belt. Scattered disc. Earth objects. Further away, ca au, probably a cloud of comets, usually called the orb cloud, but whose existence was first suggested by ernst öpik in öpik's orb cloud. Ernst julius öpik Estonian astronomer. (See: Picture 20) (See: Picture 21) Comets. The nucleus. Comet. Tail of the Ionic Tail and the Tufted Tail. Vr ng the nucleus of the comet churyumov gerasimenko rather icy dirtball. 
As the comet nucleus approaches the sun ca au ice and other solid particles begin to sublimate from solid to gaseous state to form a dust cloud of coma gas, dimensions can reach tens and hundreds of thousands of kilometres. The solar wind starts to push particles out of the coma the ion tail or gas tail is always directed away from the sun the dust tail the heavier particles are slightly deflected eea cii dust tail dusttail gas you dust trail. The solar wind starts to push the particles out of the coma. The dust tail or gas tail always points away from the sun. The orbits of comets. Short periodic. Encke type. Closer than the orbit of Jupiter, a few years, close to the ecliptic plane. Jupiter family. Around Jupiter's orbit, less than a year, may orbit at a high angle to the ecliptic plane. Descend from the Centaurs between the orbits of Saturn and Neptune, and the Kuiper belt, years. Halley type. Periodic. Typical orbits of short-period and long-period comets. (See: Figure 22) Asteroids or minor planets. Orbiting the Sun mostly between the orbits of Mars and Jupiter in the vicinity of the asteroids. See Figure 22. They are also known as Trojan Jupiters. Trojans are about one degree in front of and behind Jupiter. Near-Earth periheel au apollo amor and ateena type. (See: Figure 23) The first and largest km asteroid ceres was discovered by giuseppe piazzi in January. Ceres is now considered a dwarf planet. The main types of asteroids by chemical composition and reflectivity. Carbonaceous carbon types, the darkest. Ca. Silicate silicon types. Ca. Ca. Metallic metallic ni, fe. Eros vesta. Meteoroid interplanetary celestial body with dimensions um mm. Meteoroid a meteoroid or small asteroid entering the Earth's atmosphere and exploding, popularly a shooting star larger ones are called boloids or fireballs. A meteorite is a meteoroid or asteroid that has reached the Earth's surface. Rock iron and mixed meteorites. 
(See: Picture 24) (See: Picture 25) (See: Picture 26) (See: Picture 27) Our solar system is billions of years old. Mass total solar mass. The next closest planetary system is at the proxima of the star Centauri, a light year away. (See: Picture 28)", test=True)

All model checkpoint layers were used when initializing TFBartForConditionalGeneration.

All the layers of TFBartForConditionalGeneration were initialized from the model checkpoint at facebook/bart-large-cnn.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartForConditionalGeneration for predictions without further training.


[{'summary_text': "The Sun is a common star among at least a billion that make up the Milky Way galaxy. The planets of the solar system are divided into two classes terrestrial id terra, or earth type, and giant planets, or jovian id jove, the parallel name for Jupiter. Earth-type planets are close to the sun, while the giant planets are far apart from each other and the sun. Only on Earth is there free oxygen in the atmosphere and liquid water on the surface. Mercury is practically a vacuum, Venus many times denser than Earth. The Moon protects the Earth from electrically charged particles from the solar wind and cosmic rays. The Earth and Mars have moons, Mercury and Venus do not. The Sun and solar system move around the centre of the MilkyWay galaxy at a speed of about km/. The surface structures are very different in the Mercury, where the unchanging surface is covered by numerous craters, as is the active volcanic activity on the Moon's water. Rotation speeds on their own axes are