<a href="https://colab.research.google.com/github/ChanJianHao/SubMe/blob/master/SubMe.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SubMe

GitHub URL: [SubMe](https://github.com/HandsomeWJ/SubMe/blob/master/SubMe.ipynb)

Cloud-based machine translation for your media files! No more language barriers with SubMe!


**Instructions** 
1. Review the 'Project Setup' settings. We recommend leaving the values at their defaults if you are unsure what they do.
1. Choose your preferred processing method under `processMethod` and then press `Ctrl+F9` to run all. 


# Types of `processMethod`

1. '**Upload from Local**' - Upload your video file from your computer.

1. '**Upload from Google Drive**' - you will first have to authenticate your Google Drive account. You will then be prompted to enter the desired number of files to sub. Lastly, you will have to enter the paths for each of these files. The paths can be obtained by right-clicking on the file name from the panel on the left.

1. '**Download with magnet**' - you will be prompted to enter a magnet link for every torrent you wish to download. Once you have entered all your magnet links, type `exit` and the downloads will begin.

1. '**Download with custom command**' - for advanced users only. Use a custom command such as `curl` to download your media files for processing.
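
When using '**Upload from Google Drive**', a mistyped path only surfaces later, during processing. The sketch below shows one way to check a path before it is added to the list. `check_media_path` is a hypothetical helper that is not part of the notebook, and the mount point is assumed to be the `/content/drive` location used by the authentication cell.

```python
import os

# Assumption: Drive is mounted where the notebook's drive.mount() call puts it.
DRIVE_MOUNT = "/content/drive"


def check_media_path(path, mount=DRIVE_MOUNT):
    """Return a short status string for a path typed at the prompt.

    Hypothetical helper for illustration only.
    """
    path = path.strip()
    if not path.startswith(mount + "/"):
        return "not under the Drive mount"
    if not os.path.isfile(path):
        return "file not found"
    return "ok"


# A path outside the mount is rejected without touching the file system.
print(check_media_path("/tmp/video.mp4"))
```

A check like this could run inside the input loop so that a bad path is re-prompted instead of failing during subtitle generation.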



# Project Setup

It may take a few minutes to load the program. Please be patient.

In [None]:
#@title Project Settings

import os

#@markdown Repo path to clone from.
repo_url = 'https://github.com/BingLingGroup/autosub' #@param {type:"string"}
repo_name = repo_url.split('/')[-1]

#@markdown Repo branch or tag to clone. Leave blank to clone `master`.
repo_branch = 'dev' #@param {type:"string"}

#@markdown Root workspace.
root_dir = '/content/subme' #@param {type:"string"}
repo_dir = os.path.join(root_dir, repo_name)

#@markdown Attempt to request Google Drive access. Will mount to `/content/drive/My Drive`. 
#@markdown Will save output to your linked Google Drive account inside `subme` folder. 

#@markdown Attempt to invoke high memory (25.5 GB) in Colab.
high_memory = False #@param {type:"boolean"}

#@markdown Choose `Upload from Local` if you are uploading a file from your own computer. 
#@markdown Choose `Upload from Google Drive` if you are uploading a file from Google Drive.
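#@markdown Choose `Download with magnet` to download a torrent from a magnet link.
#@markdown Choose `Download with custom command` to fetch files with your own command (advanced users only).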

processMethod = "Upload from Google Drive" #@param ["Upload from Local", "Upload from Google Drive", "Download with magnet", "Download with custom command"]




In [None]:
#@title Perform Google authentication
#@markdown Only necessary for "Upload from Google Drive" or "Download with magnet"

from google.colab import drive

if processMethod == "Upload from Google Drive" or processMethod == "Download with magnet":
  drive.mount('/content/drive')

# Additional Setup

You can ignore this section; it is for advanced users only.
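
The System Setup cell only tries to trigger the high-RAM prompt when the available memory is below `13958643712` bytes. As a rough illustration of what that number means, the sketch below expresses the same comparison as a plain function. `should_request_high_memory` is a hypothetical name, and in the notebook the available-memory figure comes from `psutil.virtual_memory().available`.

```python
GIB = 1024 ** 3

# Same constant as in the System Setup cell.
HIGH_MEMORY_THRESHOLD = 13958643712


def should_request_high_memory(available_bytes, threshold=HIGH_MEMORY_THRESHOLD):
    """Return True when available RAM is below the threshold (hypothetical helper)."""
    return available_bytes < threshold


# The threshold works out to exactly 13 GiB.
print(HIGH_MEMORY_THRESHOLD / GIB)  # 13.0
print(should_request_high_memory(12 * GIB))  # True
print(should_request_high_memory(25 * GIB))  # False
```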



In [None]:
#@title System Setup
#@markdown Attempt to invoke high memory if enabled in settings.

if high_memory:
  !pip install gputil

  # Import packages
  import os,sys,humanize,psutil,GPUtil

  def mem_report():
    print("CPU RAM Free: " + humanize.naturalsize( psutil.virtual_memory().available ))
    
    GPUs = GPUtil.getGPUs()
    for i, gpu in enumerate(GPUs):
      print('GPU {:d} ... Mem Free: {:.0f}MB / {:.0f}MB | Utilization {:3.0f}%'.format(i, gpu.memoryFree, gpu.memoryTotal, gpu.memoryUtil*100))
      
  mem_report()

  if psutil.virtual_memory().available < 13958643712:
    print('Attempting to invoke high memory.')
    print('This notebook will crash intentionally and Colab should display a prompt to offer you high-RAM.')
    print('IF THIS PROMPT DOES NOT SHOW, DISABLE high_memory IN THE SETTINGS!')
    d=[]
    while(1):
      d.append('1')



In [None]:
#@title Project Setup
#@markdown Setup directories. Clone project from git.

output_dir = '/content/drive/My Drive/subme'

os.makedirs(root_dir, exist_ok=True)

if not os.path.isdir(repo_dir):
  repo_cmd = f'--branch {repo_branch}' if repo_branch else ''
  !git clone {repo_url} {repo_cmd} --depth 1 "{repo_dir}"

In [None]:
#@title Dependency Setup
#@markdown Install system packages with `apt`, then install autosub from the repo together with `ffmpeg-normalize` and `langcodes`.

!apt install ffmpeg python3 curl git -y
!curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
!python3 get-pip.py
!pip install git+{repo_url}@{repo_branch} ffmpeg-normalize langcodes

# Gathering Videos for Processing
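
The magnet option expects every entry to begin with `magnet:`. A minimal sketch of how an entry could be checked before it is handed to the torrent session is shown here. `looks_like_magnet` is a hypothetical helper, and requiring an `xt` parameter is an extra assumption on top of the prefix rule stated in the notebook.

```python
from urllib.parse import urlparse, parse_qs


def looks_like_magnet(link):
    """Return True if the text looks like a magnet link (hypothetical helper)."""
    parsed = urlparse(link.strip())
    if parsed.scheme != "magnet":
        return False
    # Assumption: a usable magnet link carries at least one "xt" parameter.
    return "xt" in parse_qs(parsed.query)


print(looks_like_magnet("magnet:?xt=urn:btih:abc123&dn=example"))  # True
print(looks_like_magnet("https://example.com/file.torrent"))  # False
print(looks_like_magnet("exit"))  # False
```

Rejecting bad entries at the prompt avoids starting a download loop with a handle that can never complete.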

In [None]:
#@title Upload Videos for Processing
#@markdown When uploading from Local, click on 'Choose File' and navigate to
#@markdown the directory on your computer that contains the file you wish to upload.
#@markdown You can select multiple files.

#@markdown When uploading from Google Drive, you will first have to specify how many
#@markdown files you wish to upload. Then you will have to enter the complete file path
#@markdown for each of these files. The file path can be obtained by right-clicking on
#@markdown the file name from the panel on the left and selecting 'Copy Path'.

#@markdown When downloading from magnet links, please make sure that only magnet links 
#@markdown are entered. Magnet links are links which begin with `magnet:`.

#@markdown If you see an error in this section, then run this cell again. 
#@markdown Ensure you have picked the correct processing method from settings above.
from google.colab import files
import shutil

if processMethod == "Upload from Local":
  mediaList = files.upload()
  mediaList = list(mediaList.keys())
  for i in range(len(mediaList)):
    file_name = mediaList[i]
    mediaList[i] = "/content/" + file_name
  

elif processMethod == "Upload from Google Drive":
  mediaList = []
  file_no = int(input("No. of files to sub: "))
  for i in range(file_no):
    file_name = input("Please enter file path: ")
    mediaList.append(file_name)

elif processMethod == "Download with magnet":
  mediaList = []
  !apt install python3-libtorrent
  import libtorrent as lt

  ses = lt.session()
  ses.listen_on(6881, 6891)
  downloads = []

  params = {"save_path": "/content/drive/My Drive/Torrent"}

  while True:
      magnet_link = input("Enter a magnet link (type exit when done): ")
      if magnet_link.lower() == "exit":
          break
      downloads.append(
          lt.add_magnet_uri(ses, magnet_link, params)
      )
  
  # Start Download 

  import time
  from IPython.display import display
  import ipywidgets as widgets

  state_str = [
      "queued",
      "checking",
      "downloading metadata",
      "downloading",
      "finished",
      "seeding",
      "allocating",
      "checking fastresume",
  ]

  layout = widgets.Layout(width="auto")
  style = {"description_width": "initial"}
  download_bars = [
      widgets.FloatSlider(
          step=0.01, disabled=True, layout=layout, style=style
      )
      for _ in downloads
  ]
  display(*download_bars)

  while downloads:
      next_shift = 0
      for index, download in enumerate(downloads[:]):
          bar = download_bars[index + next_shift]
          if not download.is_seed():
              s = download.status()

              bar.description = " ".join(
                  [
                      download.name(),
                      str(s.download_rate / 1000),
                      "kB/s",
                      state_str[s.state],
                  ]
              )
              bar.value = s.progress * 100
          else:
              next_shift -= 1
              ses.remove_torrent(download)
              downloads.remove(download)
              bar.close() # Seems to be not working in Colab (see https://github.com/googlecolab/colabtools/issues/726#issue-486731758)
              download_bars.remove(bar)
              print(download.name(), "complete")
              mediaList.append("/content/drive/My Drive/Torrent/" + download.name())
      time.sleep(5)
  
elif processMethod == "Download with custom command":
  mediaList = []

  customCommand = input("Please enter command you want to execute: ")
  !{customCommand}
  

  # Scan /content for video files
  video_exts = (".mp4", ".mkv", ".webm", ".avi", ".flv", ".mov", ".wmv", ".ogg", ".m4v")
  for file in os.listdir("/content"):
    if file.endswith(video_exts):
      print("Found video: " + os.path.join("/content", file))
      mediaList.append(os.path.join("/content", file))

# Video Processing
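
The processing cell below runs `autosub` once per file with the `-i`, `-S`, `-D`, `-F` and `-o` flags. As a sketch of how that command line is assembled, the hypothetical helper `build_autosub_command` below builds the same string while quoting each argument with `shlex.quote`, so that file names containing spaces or quotes stay intact. The quoting step is an addition for illustration and is not what the notebook cell does.

```python
import shlex


def build_autosub_command(media_path, src_language, dst_language, output_format):
    """Build the autosub command line for one media file (hypothetical helper)."""
    # Mirrors the notebook: the output path is the media path plus the format.
    output_path = f"{media_path}.{output_format}"
    args = [
        "autosub",
        "-i", media_path,
        "-S", src_language,
        "-D", dst_language,
        "-F", output_format,
        "-o", output_path,
    ]
    return " ".join(shlex.quote(arg) for arg in args)


print(build_autosub_command("/content/my video.mp4", "ja", "en", "srt"))
# autosub -i '/content/my video.mp4' -S ja -D en -F srt -o '/content/my video.mp4.srt'
```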

In [None]:
#@markdown Choose media source and destination language.
#@markdown Refer to the list of language codes [here](http://www.lingoes.net/en/translator/langcode.htm).

src_language = 'ja' #@param {type:"string"}
dst_language = 'en' #@param {type:"string"}
output_format = 'srt'  #@param ["srt", "ass", "sub", "json", "txt"]

for file in mediaList:
  print('User mediaList file "{name}" '.format(
      name=file))
  !autosub -i "{file}" -S {src_language} -D {dst_language} -F {output_format} -o "{file}".{output_format}



# Download your subtitles
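
The download cell below looks for each subtitle at the media path followed by the destination language and the output format. A minimal sketch of that naming rule, with a check that skips files that were never produced, might look like this. Both helper names are hypothetical, and the existence check is an addition that the notebook cell does not perform.

```python
import os


def subtitle_path(media_path, dst_language, output_format):
    """Return the subtitle path the download cell expects (hypothetical helper)."""
    return f"{media_path}.{dst_language}.{output_format}"


def existing_subtitles(media_list, dst_language, output_format):
    """Keep only the expected subtitle paths that exist on disk."""
    paths = [subtitle_path(m, dst_language, output_format) for m in media_list]
    return [p for p in paths if os.path.isfile(p)]


print(subtitle_path("/content/video.mp4", "en", "srt"))
# /content/video.mp4.en.srt
```

Filtering first means one failed translation does not stop the remaining downloads.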

In [None]:
#@markdown Download your subtitles to your PC.

for file in mediaList:
  files.download(str(file) + "." + dst_language + "." + output_format)