Oleksandra2020/TED_Search


Ukrainian TED search engine

Description

This project is a search engine for TED talks in Ukrainian. It is built on a linked list data structure and the Flask framework. It is already deployed at tedsearch.pythonanywhere.com, so feel free to try it out!

Table of contents

Homework0 part1

Homework0 part2

Homework1

Homework2

Homework3

Homework4

Homework5

Installation

You do not need to install anything if you use this project on pythonanywhere.com. If you want to run it locally, you will need to install these libraries:

pip install flask
pip install pandas
pip install youtube-transcript-api
pip install stop-words

Also, if you run the program locally, you will need to run the data_collector module before running the web app. It will download, clean and save the subtitles to the 'data' folder, which I have not uploaded to this repository.
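The collect-clean-save step described above could be sketched roughly as follows. This is not the actual data_collector code: the function names, the one-JSON-file-per-video layout and the removal of bracketed cues such as "(Оплески)" are all assumptions made for illustration. The subtitle fetcher is passed in as a parameter because the exact youtube-transcript-api call depends on the installed version.

```python
import json
import re
from pathlib import Path


def clean_text(text):
    """Drop bracketed cues such as (Оплески) and collapse whitespace."""
    text = re.sub(r"[\(\[][^\)\]]*[\)\]]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def collect(video_ids, fetch_transcript, data_dir="data"):
    """Fetch, clean and save subtitles; return the list of saved paths.

    fetch_transcript is a hypothetical callable that takes a video id and
    returns a list of {"text": ...} dicts, one per subtitle segment.
    """
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for video_id in video_ids:
        segments = fetch_transcript(video_id)
        text = clean_text(" ".join(seg["text"] for seg in segments))
        path = out_dir / f"{video_id}.json"
        # ensure_ascii=False keeps Ukrainian text readable in the file
        payload = json.dumps({"id": video_id, "text": text}, ensure_ascii=False)
        path.write_text(payload, encoding="utf-8")
        saved.append(path)
    return saved
```

Passing the fetcher in also makes the step easy to try without network access, for example with a small function that returns hard-coded segments.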

After that, you can run the flask_app module and enjoy the videos!

Example of usage

First, enter the topic you want to search for.

Here you are! You can watch the video directly on the site or on ted.com (the link is after the title). At the bottom, you can find a short description of the talk. Next to it, you can find YouTube links to videos on similar topics.

You can always return to the initial page by clicking the button in the top left corner.

Keep in mind that the collection only includes talks up to 2017. If no talks are found or you made a mistake in your query, you will get a chance to enter another one.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repo on GitHub
  2. Clone the project to your own machine
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a pull request so that I can review your changes

The current version of the repository includes:

  • examples folder:
    • an example of using youtube-transcript-api
    • the json file returned by the module in this folder
    • an example of retrieving Ukrainian stop words using the stop_words library
  • modules folder:
    • csv_reader: processes csv data necessary for the research
    • data_collector: retrieves data from the API and saves it in JSON format in the data folder
    • flask_app
    • node_: representation of a linked list
    • normalizer: removes stop words and stems the given words
    • search: representation of SearchADT, which finds relevant videos, adds details and gets the number of videos with the same translator
    • templates folder
  • example_data: a few examples of processed data that I have used for the project
  • docs:
    • apart from the PDF documents with the homeworks, the build folder contains autogenerated documentation for the project
  • screens
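
To give a feel for how the node_, normalizer and search modules listed above might fit together, here is a minimal sketch. It is not the code from this repository: the class and function names, the tiny hard-coded stop word set and the word-count scoring are assumptions for illustration, and stemming is left out. The real normalizer presumably takes its stop words from the stop_words library.

```python
import re


class Node:
    """A single cell of a singly linked list."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedList:
    """A minimal linked list that stores talks (assumed structure)."""

    def __init__(self):
        self.head = None

    def add(self, data):
        # Prepend, so the most recently added item is visited first
        self.head = Node(data, self.head)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.data
            node = node.next


# Illustrative subset only; not the full Ukrainian stop word list
STOP_WORDS = {"та", "на", "про", "що", "це"}


def normalize(text):
    """Lowercase, split into words and drop stop words (no stemming)."""
    words = re.findall(r"\w+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def search(talks, query):
    """Return titles of matching talks, best match first."""
    terms = set(normalize(query))
    results = []
    for talk in talks:
        words = normalize(talk["title"] + " " + talk["text"])
        score = sum(1 for word in words if word in terms)
        if score:
            results.append((score, talk["title"]))
    results.sort(key=lambda pair: -pair[0])
    return [title for _, title in results]
```

Each talk here is a dict with "title" and "text" keys. Because there is no stemming in this sketch, a query only matches the exact word form, which is the gap the normalizer's stemming step is meant to close.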

License

MIT

Credits

Oleksandra 🦋