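# Syncing a `github` repository

Usage example for colab-misc-utils (clmutils)

The example runs in a Colab notebook, but the approach also carries over to Kaggle or to other Linux and Windows machines.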


## Install and import `clmutils`

First install the `clmutils` library with `pip` and import the functions we need. (It can also be installed with git clone.)

In [None]:
from pathlib import Path

try:
    from clmutils import create_file, append_content
except ModuleNotFoundError:
    # !pip install -Uq clmutils
    !pip install clmutils==0.1.1
    from clmutils import create_file, append_content

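## Create the key file and set up the github.com config

Paste the private key that matches a public key registered on GitHub ([https://github.com/settings/keys](https://github.com/settings/keys)) below and assign it to `gh_key`. The key shown here is a valid private key of the GitHub demo user clmutils, so you can run the demo straight through, cell by cell. When you run this in your own Colab, replace it with the private key that matches a public key on your own GitHub account ([https://github.com/settings/keys](https://github.com/settings/keys)). The key file is typically named id_rsa, id_dsa, id_ecdsa or id_ed25519, but it can be renamed to anything.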



In [None]:
gh_key = \
"""-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQAAAJBmk3z1ZpN8
9QAAAAtzc2gtZWQyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQ
AAAEAKaAKiJSYphKjds5DFaKPdxaIVV6kTs4icy2F+VTxpkZkLwM6semKx0OrrooUKeJTF
1QbR7dU7/H4sNbe7cEE9AAAAC2NsbXV0aWxzLWdoAQI=
-----END OPENSSH PRIVATE KEY-----
"""
# This is a valid key for the demo account. Please do not use it for any other purpose.


Run `create_file` to write `gh_key` to `~/.ssh/gh-key` and set its permissions (`create_file` creates the directory if it does not exist). Here `~` is the home directory.

In [None]:
create_file(gh_key, dest="~/.ssh/gh-key")

# create_file does not overwrite an existing file. To overwrite,
# pass overwrite=True, for example
# create_file(gh_key, dest="~/.ssh/gh-key", overwrite=True)

PosixPath('/root/.ssh/gh-key')

**`create_file` effectively does what the following bash commands do**
```bash
%%bash
mkdir -p ~/.ssh
cat > ~/.ssh/gh-key <<EOL
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQAAAJBmk3z1ZpN8
9QAAAAtzc2gtZWQyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQ
AAAEAKaAKiJSYphKjds5DFaKPdxaIVV6kTs4icy2F+VTxpkZkLwM6semKx0OrrooUKeJTF
1QbR7dU7/H4sNbe7cEE9AAAAC2NsbXV0aWxzLWdoAQI=
-----END OPENSSH PRIVATE KEY-----
EOL
chmod 600 ~/.ssh/gh-key

```
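A rough Python equivalent of the bash commands above is sketched here. `write_key_file` is a hypothetical name used for illustration and is not the actual `clmutils` implementation; it only assumes the behaviour described in this notebook (create the directory, refuse to overwrite unless asked, restrict the mode to 600).

```python
from pathlib import Path


def write_key_file(content: str, dest: str, overwrite: bool = False) -> Path:
    """Write content to dest with mode 600 (hypothetical helper, not clmutils code)."""
    path = Path(dest).expanduser()
    # Like `mkdir -p`: create missing parent directories.
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists() and not overwrite:
        # Leave an existing file untouched unless overwrite=True.
        return path
    path.write_text(content, encoding="utf8")
    # Like `chmod 600`: owner read/write only.
    path.chmod(0o600)
    return path
```

Calling `write_key_file(gh_key, "~/.ssh/gh-key")` would then play the role of the heredoc plus `chmod 600`.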

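Check the permissions of `~/.ssh/gh-key` and its contents. The mode must be 600 (equivalently go-rwx, so the last six characters of the mode string are `------`), because ssh refuses a private key that group or others can read. If the listing shows wider access, tighten it with `chmod`.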

In [None]:
!ls -l ~/.ssh/gh-key
print(Path('~/.ssh/gh-key').expanduser().read_text('utf8'))

-rw-r--r-- 1 root root 399 Dec 20 01:36 /root/.ssh/gh-key
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQAAAJBmk3z1ZpN8
9QAAAAtzc2gtZWQyNTUxOQAAACCZC8DOrHpisdDq66KFCniUxdUG0e3VO/x+LDW3u3BBPQ
AAAEAKaAKiJSYphKjds5DFaKPdxaIVV6kTs4icy2F+VTxpkZkLwM6semKx0OrrooUKeJTF
1QbR7dU7/H4sNbe7cEE9AAAAC2NsbXV0aWxzLWdoAQI=
-----END OPENSSH PRIVATE KEY-----



In [None]:
!chmod go-rwx ~/.ssh/gh-key
!ls -l ~/.ssh/gh-key

-rw------- 1 root root 399 Dec 20 01:36 /root/.ssh/gh-key
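The same permission check can be done from Python. This is a small sketch using only the standard library; `is_private` is a hypothetical helper name, not something `clmutils` provides.

```python
import stat
from pathlib import Path


def is_private(path) -> bool:
    """Return True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(Path(path).expanduser().stat().st_mode)
    # S_IRWXG and S_IRWXO are the group and other permission bits.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

`is_private("~/.ssh/gh-key")` is false for a `-rw-r--r--` file and true once the mode is `-rw-------`.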


The next step is to use `append_content` to set up the github.com section of `~/.ssh/config`.

In [None]:
config_github_entry = \
"""
Host github.com
   HostName github.com
   User git
   IdentityFile ~/.ssh/gh-key
"""
append_content(config_github_entry, dest="~/.ssh/config")

# what `append_content` does can also be done in bash

PosixPath('/root/.ssh/config')

Again, we can print the contents to check.

In [None]:
print(Path('~/.ssh/config').expanduser().read_text('utf8'))



Host github.com
   HostName github.com
   User git
   IdentityFile ~/.ssh/gh-key
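Re-running an append cell appears to add the same text again (the `authorized_keys` listing later in this notebook shows such a duplicate). One way to avoid that is to append only when the text is not already present. The sketch below shows the idea with a hypothetical `append_once` helper; it is not part of `clmutils`.

```python
from pathlib import Path


def append_once(content: str, dest: str) -> Path:
    """Append content to dest unless dest already contains it (hypothetical helper)."""
    path = Path(dest).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    existing = path.read_text(encoding="utf8") if path.exists() else ""
    # Compare without surrounding whitespace so extra newlines do not matter.
    if content.strip() not in existing:
        with path.open("a", encoding="utf8") as fh:
            fh.write(content)
    return path
```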



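## Check that the setup works
Run `!ssh -T git@github.com` to see whether everything is in place. If the output is `Host key verification failed.`, something is wrong with the setup (typically github.com's host key is not yet in `known_hosts`). If the output is `Hi clmutils!....`, the key and the github section of `~/.ssh/config` are fine.

The first time you ssh to a host, the system asks whether to write the host's public key to `~/.ssh/known_hosts`. Colab cells are not interactive, so run `ssh-keyscan github.com >> /root/.ssh/known_hosts`

or pass the option when calling ssh: `ssh -o StrictHostKeyChecking=no`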

In [None]:
# !ssh-keyscan github.com > /root/.ssh/known_hosts
# !cat /root/.ssh/known_hosts

In [None]:
#  ssh -T git@github.com
# if it fails, append -v or -vv to see the error messages

!ssh -o StrictHostKeyChecking=no -T git@github.com

Hi clmutils! You've successfully authenticated, but GitHub does not provide shell access.
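When scripting this check, note that `ssh -T git@github.com` typically exits with a non-zero status even on success, because GitHub closes the session without providing a shell. The sketch below therefore inspects the message instead of the exit code. `github_auth_ok` and `check_github_ssh` are hypothetical helper names, and the matched phrase is taken from the output shown above.

```python
import subprocess


def github_auth_ok(output: str) -> bool:
    """Return True if the ssh greeting reports a successful authentication."""
    return "successfully authenticated" in output


def check_github_ssh() -> bool:
    """Run `ssh -T git@github.com` and inspect its combined stdout and stderr."""
    proc = subprocess.run(
        ["ssh", "-o", "StrictHostKeyChecking=no", "-T", "git@github.com"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf8",
    )
    return github_auth_ok(proc.stdout)
```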


## Configure git
The relevant git command is
```bash
git config --global
```
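The clmutils account uses the email address `colab.misc.utils@gmail.com` and the user name `clmutils`.
After setting them, list the settings with `--list` to check.

**Likewise, when you run your own Colab, replace these with the details of your own `github` account.**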

In [None]:
%%bash
git config --global user.email colab.misc.utils@gmail.com
git config --global user.name clmutils
git config --global --list

user.email=colab.misc.utils@gmail.com
user.name=clmutils


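## git clone/push your own repository

We use the clmutils repository as the example. Open the repository home page [https://github.com/clmutils/colab-misc-utils](https://github.com/clmutils/colab-misc-utils), click Code, then choose **`SSH`** (the HTTPS address does not work with the public-key method used here), and copy the address, [git@github.com:clmutils/colab-misc-utils.git](git@github.com:clmutils/colab-misc-utils.git), for use in `git clone`.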

In [None]:
%%bash
cd /content
git clone git@github.com:clmutils/colab-misc-utils.git
cd colab-misc-utils
pwd
ls

/content/colab-misc-utils
clmutils
poetry.lock
pyproject.toml
README.md
requirements-dev.txt
requirements.txt
run-poetry-export-requirements-dev.bat
tests


Cloning into 'colab-misc-utils'...
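The SSH address follows a fixed pattern, so it can be derived from the repository's web address. A small sketch, with `to_ssh_url` as a hypothetical helper that only handles plain `https://github.com/<owner>/<repo>` style addresses:

```python
from urllib.parse import urlparse


def to_ssh_url(https_url: str) -> str:
    """Convert https://github.com/owner/repo[.git] to git@github.com:owner/repo.git."""
    parsed = urlparse(https_url)
    path = parsed.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"git@{parsed.netloc}:{path}.git"


print(to_ssh_url("https://github.com/clmutils/colab-misc-utils"))
# git@github.com:clmutils/colab-misc-utils.git
```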


Modify a file or create a new one (simulated here with `touch data.txt`), then `git push` it to the repository.

In [None]:
%%bash
cd /content/colab-misc-utils
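touch data.txt
# a new file is untracked, so stage it explicitly; `commit -a` only picks up tracked files
git add data.txt
git commit -m "update clmutils test data.txt"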
git push

[master 1b28fb5] update clmutils test data.txt
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 data.txt


To github.com:clmutils/colab-misc-utils.git
   8622e11..1b28fb5  master -> master


You can see that data.txt was pushed to the repository at [https://github.com/clmutils/colab-misc-utils](https://github.com/clmutils/colab-misc-utils). You may need to refresh the page.

Now we delete data.txt and update the repository again.

In [None]:
%cd /content/colab-misc-utils/
!rm data.txt
!git add .
!git commit -m "update test delete data.txt"
!git push
!ls

/content/colab-misc-utils
[master 1395b80] update test delete data.txt
 1 file changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 data.txt
Counting objects: 2, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 235 bytes | 235.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:clmutils/colab-misc-utils.git
   1b28fb5..1395b80  master -> master
clmutils	README.md	      run-poetry-export-requirements-dev.bat
poetry.lock	requirements-dev.txt  tests
pyproject.toml	requirements.txt


--End--


# Reverse ssh tunnel (unfinished)

# Library still being tested and written...

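(The reverse ssh tunnel feature of clmutils will be demonstrated once it is implemented. Once a reverse ssh tunnel is set up in Colab, another machine can ssh into the Colab machine.)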

In [None]:
from clmutils import chmod600
fpath = "/root/.ssh/gh-key"
fpath = Path(fpath)
fpath.exists()
_ = fpath.stat().st_mode
display(oct(_))
fpath.chmod(0o666)
display(oct(fpath.stat().st_mode))
chmod600(fpath)
display(oct(fpath.stat().st_mode))

'0o100600'

'0o100666'

[I 201220 01:36:58 chmod600:22] /root/.ssh/gh-key mode set to 0o100600


'0o100600'

In [None]:
!ls ~/.ssh


authorized_keys  config  gh-key  id_ed25519  id_ed25519.pub  known_hosts


In [None]:
# !ssh-keygen -q -t ed25519 -N "" -C "colab-key" -f ~/.ssh/id_ed25519 <<< y

import subprocess as sp
from shlex import split
from logzero import logger

In [None]:
# !ssh-keygen -q -N "" -C "colab-key" -f ~/.ssh/id_rsa
# !rm ~/.ssh/id_rsa

cmd = split('ssh-keygen -t ed25519 -N "" -C "colab-key" -f /root/.ssh/id_ed25519')
# /root/.ssh/id_ed25519 /root/.ssh/id_ed25519.pub

try: 
    _ = sp.check_output(cmd, encoding='utf8', stderr=sp.STDOUT)
    # print(_)
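except sp.CalledProcessError as e:
    # ssh-keygen exits non-zero, e.g. when the key file already exists;
    # only CalledProcessError carries the command's output
    logger.debug(e.output.splitlines()[:-1])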
    # logger.debug('already exists' in e.output)
    # print(e.returncode)
_ = Path("~/.ssh/id_ed25519.pub").expanduser().read_text("utf8").strip()
logger.info("colab public key:\n%s", _)
_ = "copy and paste colab public key to ~/.ssh/authorized_keys"\
    "\n in the computer you want to access colab from"
logger.info("\n%s", _)

[I 201220 01:36:58 <ipython-input-17-ef8eb37980dd>:16] colab public key:
    ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIL4rtBCqrjY96tTzBEwKmLot549jgvjPU2hV4hYEZSfJ colab-key
[I 201220 01:36:58 <ipython-input-17-ef8eb37980dd>:18] 
    copy and paste colab public key to ~/.ssh/authorized_keys
     in the computer you want to access colab from


In [None]:
remote_pubkey = input("Paste the public key (typically ~/.ssh/id_rsa.pub)\n of your computer here: ")

Paste the public key (typically ~/.ssh/id_rsa.pub)
 of your computer here: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmoipXu7zLahIFRQXcPlYWXfvn/gytrQqzIG7eHA4yv root@acone3


In [None]:
print(remote_pubkey)

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmoipXu7zLahIFRQXcPlYWXfvn/gytrQqzIG7eHA4yv root@acone3


In [None]:
# remote_pubkey
_ = append_content(remote_pubkey, "~/.ssh/authorized_keys")

In [None]:
_ = Path('~/.ssh/authorized_keys').expanduser().read_text("utf8")
print(f"the pub key of your computer is:\n {_}")

the pub key of your computer is:
 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmoipXu7zLahIFRQXcPlYWXfvn/gytrQqzIG7eHA4yv root@acone3
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmoipXu7zLahIFRQXcPlYWXfvn/gytrQqzIG7eHA4yv root@acone3


In [None]:
# !cat ~/.ssh/authorized_keys
!ls ~/.ssh

authorized_keys  config  gh-key  id_ed25519  id_ed25519.pub  known_hosts


In [None]:
_ = """
%%bash
# sudo apt update
apt install openssh-server 
apt install autossh
/etc/init.d/ssh start
autossh -M 0 -f 216.24.255.63 -CN -R 2222:127.0.0.1:22
# """

import subprocess as sp
from shlex import split
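def run_cmd(cmd):
    """Run a command string; print its output and exit code if it fails."""
    cmd = split(cmd)
    try:
        sp.check_output(cmd, stderr=sp.STDOUT, encoding="utf8")
    except sp.CalledProcessError as e:
        # the command ran but returned a non-zero exit code
        print(e.output)
        print(e.returncode)
    except OSError as e:
        # the command could not be started, e.g. executable not found
        print(e)
# -y answers the confirmation prompt, which cannot be answered from a subprocess
run_cmd("apt install -y openssh-server")
run_cmd("apt install -y autossh")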
run_cmd("/etc/init.d/ssh start")
run_cmd("autossh -M 0 -f 216.24.255.63 -CN -R 2222:127.0.0.1:22 -o StrictHostKeyChecking=no")


In [None]:
# run_cmd("pkill autossh")
# run_cmd("autossh -M 0 -f 216.24.255.63 -CN -R 2222:127.0.0.1:22 -o StrictHostKeyChecking=no")
# !autossh -M 0 216.24.255.63 -CN -R 2222:127.0.0.1:22 -o StrictHostKeyChecking=no 

In [None]:
%%bash
# autossh -M 0 -f 216.24.255.63 -CN -R 2222:127.0.0.1:22
# kill -9 7363
# pkill autossh
ps aux|grep autossh|grep -v defunc|grep -v grep
ps aux|grep sshd|grep -v defunc|grep -v grep

# ssh 216.24.255.63 -CN -R 2223:127.0.0.1:22 -o StrictHostKeyChecking=no 

root        1389  0.0  0.0  34208  2336 ?        Ss   01:58   0:00 /usr/lib/autossh/autossh -M 0    216.24.255.63 -CN -R 2222:127.0.0.1:22 -o StrictHostKeyChecking=no
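The `autossh` call used above packs several options into one line. The sketch below builds the same argument list piece by piece to show what each part does. `reverse_tunnel_cmd` is a hypothetical helper, and the host and ports are simply the ones used in this notebook.

```python
def reverse_tunnel_cmd(host: str, remote_port: int = 2222, local_port: int = 22) -> list:
    """Build the autossh argument list for a reverse tunnel (hypothetical helper)."""
    return [
        "autossh",
        "-M", "0",   # disable autossh's own monitoring port
        "-f",        # go to the background
        host,
        "-C",        # compress traffic
        "-N",        # run no remote command, only forward ports
        # remote_port on `host` is forwarded back to 127.0.0.1:local_port on this machine
        "-R", f"{remote_port}:127.0.0.1:{local_port}",
        "-o", "StrictHostKeyChecking=no",
    ]


print(" ".join(reverse_tunnel_cmd("216.24.255.63")))
# autossh -M 0 -f 216.24.255.63 -C -N -R 2222:127.0.0.1:22 -o StrictHostKeyChecking=no
```

A list like this can be passed directly to `subprocess.check_output`, which avoids the `shlex.split` step.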
root         937  0.0  0.0  95532  5344 ?        Ss   01:40   0:00 /usr/sbin/sshd


In [None]:
# !cat ~/.ssh/authorized_keys
# -o StrictHostKeyChecking=no 
# !passwd 
# !ssh 216.24.255.63 -CN -R 2222:127.0.0.1:22
# !autossh -M 0 -f 216.24.255.63 -CN -R 2222:127.0.0.1:22

# !which autossh

In [None]:
# !which sshd
# !ls -l /etc/init.d/ssh
# !sudo systemctl status ssh
# !ps aux|grep sshd
# import random
# import string
# ''.join(random.choice(string.ascii_letters + string.digits) for i in range(20))

In [None]:
print("To test the reverse channel in the remote computer:")
print("$ curl -I 127.0.0.1:2222")
print("> Weird server reply -> OK ")
print("> Connection refused -> Not OK")
print("\nIf OK, to connect to Colab computer from the remote computer:")
print("$ ssh -p 2222 127.0.0.1 -o StrictHostKeyChecking=no")

To test the reverse channel in the remote computer:
$ curl -I 127.0.0.1:2222
> Weird server reply -> OK 
> Connection refused -> Not OK

If OK, to connect to Colab computer:
$ ssh -p 2222 127.0.0.1 -o StrictHostKeyChecking=no
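The "weird server reply" appears because an ssh server answers with its own identification line (starting with `SSH-`) rather than with HTTP. The sketch below reads that line directly using only the standard library. `ssh_banner` is a hypothetical helper, and port 2222 is the tunnel port used above; run it on the remote computer.

```python
import socket


def ssh_banner(host: str = "127.0.0.1", port: int = 2222, timeout: float = 3.0):
    """Return the first line sent by the server, or None if nothing could be read."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(256)
    except OSError:
        # connection refused, timed out, or otherwise unreachable
        return None
    text = data.decode("utf8", errors="replace")
    return text.splitlines()[0] if text else None


banner = ssh_banner()
print("OK" if banner and banner.startswith("SSH-") else "Not OK", banner)
```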


In [None]:
# get_ipython().system_raw('ls -l')
# get_ipython().system('ls -l')
# get_ipython().system? 
# reboot: !reboot vs !kill -9 -1
# restart runtime/jupyter ctrl-M. vs os._exit() vs !kill {os.getpid()}
import os


57

In [None]:
# or use ngrok 
# https://medium.com/@meet_patel/how-to-ssh-into-google-colab-and-run-scripts-from-terminal-instead-of-jupyter-notebook-3931f2674258

In [None]:
import subprocess as sp
from shlex import split
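def run_cmd(cmd):
    cmd = split(cmd)
    try:
        sp.check_output(cmd, stderr=sp.STDOUT, encoding="utf8")
    except sp.CalledProcessError as e:
        # the command ran but returned a non-zero exit code
        print(e)
        print(e.output)
        print(e.returncode)
    except OSError as e:
        # the command could not be started (FileNotFoundError has no .output)
        print(e)
run_cmd("ls0 ls0")

[Errno 2] No such file or directory: 'ls0': 'ls0'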