# Project – Developing an Alpha Zero Game Player

### Laboratory of Artificial Intelligence and Data Science

#### TP1 - Group 7:
##### Anna Sellani
##### Gonçalo Dias
##### Tomás Azevedo
##### Vicente Bandeira

In [None]:
from MCTS import MCTS
import tensorflow as tf
import optimizar,avaliar,selfplay
import Go,Attaxx
from ioannina import Neura, get_best_name,exports,opts
import socket
import time
import numpy as np
from Go import GameState as Go, setScreen as set_screen_go, drawBoard as draw_board_go, drawPieces as draw_pieces_go, drawResult as draw_result_go
from Attaxx import GameState as Attaxx, move as execute_move, _objective_test as is_game_finished_attaxx, get_moves, setScreen as set_screen_attaxx, drawBoard as draw_board_attaxx, drawPieces as draw_pieces_attaxx, drawResult as draw_result_attaxx
import pygame

## Server protocol

In [None]:
games = ["A4x4", "A5x5", "A6x6", "G7x7", "G9x9"]
game = games[0]     # ATTAXX
game = games[3]     # GO

INVALID_LIMIT = 2
TIME_LIMIT = 10  # (seconds)
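
`TIME_LIMIT` is defined above, but the server loops in this notebook never apply it. One possible way to enforce it is `socket.settimeout`, sketched below under the assumption that an agent that stays silent past the limit is simply treated as having sent nothing. The helper name `recv_with_timeout` is hypothetical and not part of the project code.

```python
import socket

def recv_with_timeout(conn, time_limit, bufsize=1024):
    """Return the decoded message, or None if nothing arrives in time.

    Hypothetical helper, not part of the original server code.
    """
    conn.settimeout(time_limit)          # seconds; applies to the next recv
    try:
        return conn.recv(bufsize).decode()
    except socket.timeout:               # raised when the limit expires
        return None
    finally:
        conn.settimeout(None)            # restore blocking mode
```

In a server loop, a `None` result could be counted like an invalid move, which is a design choice rather than something the notebook specifies.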

In [None]:
def is_move_valid_go(game: Go, move):    # implementing the logic to check if the move is valid
    return move in game.check_possible_moves(game)

In [None]:
def start_server_go(host='localhost', port=12345):
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind((host, port))
    server_socket.listen(2)

    print("Waiting for two agents to connect...")
    agent1, addr1 = server_socket.accept()
    print("Agent 1 connected from", addr1)
    bs=b'AG1 '+game.encode()
    agent1.sendall(bs)

    agent2, addr2 = server_socket.accept()
    print("Agent 2 connected from", addr2)
    bs=b'AG2 '+game.encode()
    agent2.sendall(bs)
       
    n = int(game[1])
    initial_board = np.zeros((n, n),dtype=int)     # initializing an empty board of size (n x n)
    GameState = Go(initial_board)    # initializing the game
    
    pygame.init()
    screen = set_screen_go()    # setting the screen for graphical display
    draw_board_go(GameState, screen)
    pygame.display.update()

    agents = [agent1, agent2]
    current_agent = 0

    jog=0
    invalid_count = 0   # consecutive invalid moves count 
    
    time.sleep(3)
    while True:
        try:
            data = None
            data = agents[current_agent].recv(1024).decode()
            if not data:
                break

            if data == "PASS":
                agents[current_agent].sendall(b'VALID')
                agents[1-current_agent].sendall(data.encode())
                GameState = GameState.pass_turn()
            else:
                # processing the move (example: "MOVE X,Y")
                i = int(data[5])
                j = int(data[7])
                if current_agent == 0:
                    print("Agent 1 -> ",data)
                else:
                    print("Agent 2 -> ",data)
                jog = jog+1
                
                # checking if the move is valid and, if so, executing it
                if is_move_valid_go(GameState,(i,j)):
                    agents[current_agent].sendall(b'VALID')
                    agents[1-current_agent].sendall(data.encode())
                    GameState = GameState.move((i,j))
                    time.sleep(0.1)
                    draw_board_go(GameState, screen)
                    draw_pieces_go(GameState, screen)
                    event = pygame.event.poll()
                else:
                    agents[current_agent].sendall(b'INVALID')
                    invalid_count += 1
                    if invalid_count < INVALID_LIMIT:   # once the count reaches INVALID_LIMIT, the agent loses its turn
                        continue
                    agents[current_agent].sendall(b'TURN LOSS')
                    agents[1-current_agent].sendall(b'PASS')
                    GameState = GameState.pass_turn()
                    invalid_count = 0
                
            pygame.display.update()
                
            # checking if the game is over
            if GameState.is_game_finished():
                GameState.end_game()
                winner = GameState.winner
                if winner == -1:
                    winner = 2
                p1_score = GameState.scores[1]
                p2_score = GameState.scores[-1]
                data = "END " + str(winner) + " " + str(p1_score) + " " + str(p2_score)
                agents[current_agent].sendall(data.encode())
                agents[1-current_agent].sendall(data.encode())
                draw_result_go(GameState, screen)
                pygame.display.update()
                time.sleep(4)
                pygame.quit()
                break
                
            # Switch to the other agent
            current_agent = 1-current_agent

        except Exception as e:
            print("Error:", e)
            break

    print("\n-----------------\nGAME END\n-----------------\n")
    time.sleep(1)
    agent1.close()
    agent2.close()
    server_socket.close()
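
The server above reads coordinates from fixed character positions (`data[5]`, `data[7]`), which works for the single-digit coordinates of boards up to 9x9. A minimal sketch of a more tolerant alternative splits on the separators instead. The function name `parse_message` is hypothetical; the message shapes (`PASS`, `MOVE X,Y`, `MOVE X,Y,X2,Y2`) are the ones used in this notebook.

```python
def parse_message(data):
    """Parse 'PASS', 'MOVE X,Y' or 'MOVE X,Y,X2,Y2' into (kind, coords)."""
    data = data.strip()
    if data == "PASS":
        return "PASS", ()
    kind, _, payload = data.partition(" ")
    if kind != "MOVE" or not payload:
        raise ValueError(f"unrecognised message: {data!r}")
    # int() raises ValueError on malformed coordinates
    coords = tuple(int(part) for part in payload.split(","))
    if len(coords) not in (2, 4):
        raise ValueError(f"expected 2 or 4 coordinates, got {len(coords)}")
    return kind, coords
```

A `ValueError` here could be handled by the server in the same way as an invalid move.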

In [None]:
def create_board_attaxx(n):
    board = np.zeros((n, n),dtype=int)
    board[0][0] = board[n-1][n-1] = 1
    board[n-1][0] = board[0][n-1] = -1
    return board

In [None]:
def is_move_valid_attaxx(GameState,i,j,i2,j2):
    possible_moves = [(1,0),(2,0),(1,1),(2,2),(1,-1),(2,-2),(-1,0),(-2,0),(-1,1),(-2,-2),(0,1),(0,2),(0,-1),(0,-2),(-1,-1),(-2,2)]
    move = (i2-i,j2-j)
    if move not in possible_moves:
        return False
    moves = get_moves(GameState,(i,j))
    return moves[move][0]

In [None]:
def start_server_attaxx(host='localhost', port=12345):
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind((host, port))
    server_socket.listen(2)

    print("Waiting for two agents to connect...")
    agent1, addr1 = server_socket.accept()
    print("Agent 1 connected from", addr1)
    bs=b'AG1 '+game.encode()
    agent1.sendall(bs)

    agent2, addr2 = server_socket.accept()
    print("Agent 2 connected from", addr2)
    bs=b'AG2 '+game.encode()
    agent2.sendall(bs)
       
    n = int(game[1])
    initial_board = create_board_attaxx(n)     # initial board of size (n x n) with the four corner pieces
    GameState = Attaxx(initial_board)    # initializing the game

    pygame.init()
    screen = set_screen_attaxx()    # setting the screen for graphical display
    draw_board_attaxx(GameState, screen)
    draw_pieces_attaxx(GameState, screen)
    pygame.display.update()

    agents = [agent1, agent2]
    current_agent = 0
    player_id = 1

    jog=0
    invalid_count = 0   # consecutive invalid moves count 
    
    time.sleep(3)
    while True:
        try:
            data = None
            data = agents[current_agent].recv(1024).decode()
            if not data:
                break

            # processing the move (example: "MOVE X,Y,X2,Y2")
            i = int(data[5])
            j = int(data[7])
            i2 = int(data[9])
            j2 = int(data[11])
            if current_agent == 0:
                print("Agent 1 -> ",data)
            else:
                print("Agent 2 -> ",data)
            jog = jog+1
            
            # checking if the move is valid and, if so, executing it
            if is_move_valid_attaxx(GameState,i,j,i2,j2):
                agents[current_agent].sendall(b'VALID')
                agents[1-current_agent].sendall(data.encode())
                GameState = execute_move(GameState,(i,j),(i2,j2),player_id=player_id)
                time.sleep(0.1)
                draw_board_attaxx(GameState, screen)
                draw_pieces_attaxx(GameState, screen)
                event = pygame.event.poll()
            else:
                invalid_count += 1
                if invalid_count < INVALID_LIMIT:   # once the count reaches INVALID_LIMIT, the agent loses its turn
                    agents[current_agent].sendall(b'INVALID')
                    continue
                agents[current_agent].sendall(b'TURN LOSS')
                agents[1-current_agent].sendall(b'PASS')
                GameState.switchPlayer()
                invalid_count = 0

            pygame.display.update()

            # checking if the game is over
            value,score1,score2 = is_game_finished_attaxx(GameState,player=player_id)   # -1 if game is not over, 0 if it's a draw, 1 if player 1 won and 2 if player 2 won 
            if value != -2:
                result = value
                if result == 1:
                    p1_score = score1
                    p2_score = score2
                else:
                    p2_score = score1
                    p1_score = score2
                data = "END " + str(result) + " " + str(p1_score) + " " + str(p2_score)
                agents[current_agent].sendall(data.encode())
                agents[1-current_agent].sendall(data.encode())
                draw_result_attaxx(GameState, screen)
                pygame.display.update()
                time.sleep(4)
                pygame.quit()
                break
                
            # Switch to the other agent
            current_agent = 1-current_agent
            player_id = 0-player_id
            time.sleep(2)

        except Exception as e:
            print("Error:", e)
            break

    print("\n-----------------\nGAME END\n-----------------\n")
    time.sleep(1)
    agent1.close()
    agent2.close()
    server_socket.close()

In [None]:
if __name__ == "__main__":
    if game[0]=='G':
        start_server_go()
    elif game[0]=='A':
        start_server_attaxx()

## Monte Carlo Tree Search

In [None]:
import math
import random
import numpy as np
from go.inputconverter import *
import time
import decimal

select, expand and evaluate, backup, play<br/>
<br/>
APV-MCTS variant <br/>
<br/>
N = visit_count<br/>
W = total_action_value<br/>
Q = mean_action_value<br/>
P = prior_prob of selecting that edge<br/>
exploration constant = cpuct <br/>
<br/>
Q = W/N # controls exploitation<br/>
U = cpuct*P*(math.sqrt(sum_N)/(1+N)) # controls exploration<br/>
<br/>
edges (moves)<br/>
nodes (positions/states)
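
The selection rule described in these notes can be sketched as a single function. This is an illustration under stated assumptions, not the class used below: the parameter names `c_init` and `c_base` are chosen here for readability, with `c_base` set to the constant 19652 that appears in `Node.cpuct` and `c_init` standing in for `args['cpuct']`.

```python
import math

def puct_score(parent_n, child_n, child_w, prior, c_init, c_base=19652):
    """Q + U for one edge, following the notes above.

    parent_n: visit count of the parent node
    child_n, child_w: N and W of the edge being scored
    prior: P, the network's prior probability for the edge
    c_init: base exploration constant (assumed to match args['cpuct'])
    """
    q = child_w / child_n if child_n > 0 else 0.0      # exploitation term
    cpuct = math.log((parent_n + c_base + 1) / c_base) + c_init
    u = cpuct * prior * math.sqrt(parent_n) / (1 + child_n)   # exploration term
    return q + u
```

Selection then picks the child with the largest score, so edges with a high prior and few visits are favoured early, while Q dominates as visits accumulate.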

In [None]:
class Node:
    def __init__(self, game_state, args, mcts, parent=None, p_action=None, prior_prob=0,play_idx=0):
        self.game_state=game_state
        self.args=args
        self.parent=parent
        self.p_action=p_action
        self.prior_prob=prior_prob  # P
        self.children=[]
        self.visit_count=0  # N
        self.total_action_value=0   # W
        self.possible=self.game_state.n**2+self.game_state.type
        self.mcts=mcts
        self.play_idx=play_idx

    def fully_expanded(self):
        return len(self.children)>0     # if no expandable moves and there are children
    
    def select(self):   # chooses child with best ucb 
        if not self.fully_expanded():
            return self
        selected = max(self.children, key=lambda child: self.ucb(child))
        return selected.select()
    
    def cpuct(self, visit_count):   # defining cpuct according to paper
        return math.log((visit_count+19652+1)/19652)+self.args['cpuct']
    
    def ucb(self, child):   # uses variant of the PUCT algorithm
        if child is None:       # to avoid 'NoneType' error
            return 0
        if child.visit_count==0:
            mean_action_value=0     
        else:
            mean_action_value=child.total_action_value/child.visit_count        # mean_action_value Q=W/N
        return mean_action_value+self.cpuct(self.visit_count)*child.prior_prob*(math.sqrt(self.visit_count)/(1+child.visit_count))

    def expand(self, p):
        for idx in range(self.possible):
            action=self.mcts.get_act(idx)
            if action in self.game_state.empty_positions or action==(-1,-1):    # only empty positions, plus the pass move (-1,-1)
                next_state = self.game_state.move(action)
                child = Node(next_state,self.args, parent=self, p_action=action, prior_prob=p[idx],mcts=self.mcts,play_idx=self.play_idx+1)
                self.children.append(child)
    
    def backprop(self, v):
        self.total_action_value  += v
        self.visit_count += 1
        if self.parent is not None:
            self.parent.backprop(v)

In [None]:
class MCTS:
    def __init__(self, game_state, args, model,eva=False):
        self.game_state=game_state
        self.args=args
        self.model=model
        self.evaluate=eva
        self.ti=self.setind(game_state)
        self.root=Node(self.game_state, self.args,self)
        self.pi=np.zeros(self.game_state.n**2+self.game_state.type)
        self.map=self.map_act()

    def setind(self,game):  # temperature according to game and boards
        if game.type==0:
            match len(game.board):
                case 4:
                    tind=2
                case 6:
                    tind=3
        else:
            match len(game.board):
                case 7:
                    tind=5
                case 9:
                    tind=7
        return tind
    
    def map_act(self):  # mapping actions
        actions=[]      # avoids shadowing the built-in list
        if self.game_state.name=='attaxx':
            for i in range(len(self.game_state.board)):
                for j in range(len(self.game_state.board[0])):
                    actions.append((j,i))
        else:
            for i in range(len(self.game_state.board)):
                for j in range(len(self.game_state.board[0])):
                    actions.append((i,j))
            actions.append((-1,-1))
        return actions

    def get_act(self,_):
        return  self.map[_]
    
    def cut(self,action):   # new root node is the child corresponding to the played action
        for child in self.root.children:
            if child.p_action==action:
                self.root=child
                self.pi=np.zeros(self.game_state.n**2+self.game_state.type)

    def printTree(self, node, level=0, prefix=""):  # analysis purposes only
        if node is not None:
            print(" " * level * 2 + f"{prefix}+- action: {node.p_action}, N: {node.visit_count}, W: {node.total_action_value}")
            for i, child in enumerate(node.children):
                self.printTree(child, level + 1, f"{prefix}|  " if i < len(node.children) - 1 else f"{prefix}   ")

    def get_play(self,passe=None):  # chooses move based on maximum pi
        max_val=0
        ind=[]
        for i in range(len(self.pi)):
            if i==passe:
                continue
            val=self.pi[i]
            if val>max_val:
                ind=[i]
                max_val=self.pi[i]
                continue
            if val==max_val:
                ind.append(i)
        return random.choice(ind)
    
    def play(self):
        for _ in range(self.args['num_searches']):
            node=self.root

            # selection
            while node.fully_expanded():
                node=node.select()

            # check if node is terminal or not
            terminal=self.game_state.is_game_finished()
            
            # expand and evaluate
            if not terminal:
                if self.game_state.type==1:
                    board=gen_batch(node.game_state)
                else:
                    board=node.game_state.board
                p, v = self.model.net.predict(np.array([board]),batch_size=1,verbose=0)
                p=p[0]
                v=v[0][0]
                if self.root.play_idx-1>self.ti or self.evaluate:
                    p=0.75*p+0.25*np.random.dirichlet([0.2]*len(p))     # adding Dirichlet noise to the prior, one noise value per action

                node.expand(p)      # adding children with policy from the NN to list children
            
            # backpropagate
            node.backprop(v)

        if self.root.play_idx-1<=self.ti and not self.evaluate:
            temp=1
        else:
            temp=10**(-2)

        sumb=decimal.Decimal(0)
        for child in self.root.children:
            if child is None or child.visit_count==0:
                continue
            else:
                sumb+=(decimal.Decimal(child.visit_count)**decimal.Decimal(1/temp))
                    
        for child in self.root.children:
            if child is None:
                continue
            if child.visit_count == 0:
                self.pi[self.map.index(child.p_action)] = 0
            elif child.visit_count == 1:
                self.pi[self.map.index(child.p_action)] = 0.1
            else: 
                self.pi[self.map.index(child.p_action)] = (float)((decimal.Decimal(child.visit_count)**decimal.Decimal((1/temp)))/(sumb))
        
        pol=self.pi
        max_prob_index=self.get_play()
        if max_prob_index == self.game_state.n**2:
            self.cut((-1,-1))
            return (-1, -1),pol     # define this as "pass"
        else:
            played=((max_prob_index // self.game_state.n), (max_prob_index % self.game_state.n))    # convert 1D array index to 2D array coordinates
            print(f"Play chosen: {played}")     # analysis purposes only
            self.cut(played) 
            return played,pol
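
`play()` turns the root's visit counts into a policy proportional to N^(1/temp) and uses `decimal` to survive the very large powers that arise when `temp` is 0.01. A sketch of an alternative, which is not what the class above does, computes the same ratio in log space with NumPy. The function name `visit_policy` is hypothetical, and the sketch deliberately leaves out the special value 0.1 that `play()` assigns to children visited exactly once.

```python
import numpy as np

def visit_policy(visit_counts, temperature):
    """pi_a proportional to N_a ** (1 / temperature), computed in log space."""
    counts = np.asarray(visit_counts, dtype=float)
    pi = np.zeros_like(counts)
    visited = counts > 0
    if not visited.any():
        return pi                              # nothing searched yet
    logits = np.log(counts[visited]) / temperature
    logits -= logits.max()                     # avoid overflow in exp
    weights = np.exp(logits)
    pi[visited] = weights / weights.sum()
    return pi
```

With `temperature=1` the result is the plain visit-count distribution; with a small temperature nearly all mass moves to the most visited child.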


## Neural Network

The network below is an adaptation of the original AlphaZero architecture, which builds on the residual networks proposed by He, K. et al. (2016).

The main changes are:

    * the number of filters.
    * the Attaxx input is 2D.

The Go input follows the original approach of stacking the board positions from time steps t, ..., t-7, while the Attaxx input is an 8-padded board for size flexibility.
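
To make the stacked Go input concrete, here is a minimal sketch that produces the 17 planes the network expects: 8 for the side to move, 8 for the opponent, and 1 colour plane. In the project this conversion is done by `gen_batch` from `go.inputconverter`, which is not shown in this notebook, so the function name `stack_history`, the plane ordering and the 1 / -1 stone encoding are all assumptions made for illustration.

```python
import numpy as np

def stack_history(history, player, steps=8):
    """Build an (n, n, 2*steps + 1) input from the most recent boards.

    history: list of (n, n) integer boards, oldest first, with stones
             encoded as 1 and -1 (assumed encoding)
    player:  1 or -1, the side to move
    """
    n = history[-1].shape[0]
    planes = np.zeros((n, n, 2 * steps + 1), dtype=np.float32)
    recent = history[-steps:][::-1]            # newest first: t, t-1, ...
    for k, board in enumerate(recent):
        planes[:, :, k] = (board == player)            # own stones at t-k
        planes[:, :, steps + k] = (board == -player)   # opponent stones at t-k
    planes[:, :, -1] = 1.0 if player == 1 else 0.0     # colour plane
    return planes
```

Positions earlier than the start of the game are left as all-zero planes, so the shape stays fixed from the first move onward.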

In [1]:
import tensorflow as tf
from tensorflow import keras
from keras import layers,regularizers,optimizers
import numpy as np
from keras.models import Model
import os, random
import names
from go.inputconverter import *
from shutil import copy
import math
from avaliar import makegame

pygame 2.1.2 (SDL 2.0.18, Python 3.10.0)
Hello from the pygame community. https://www.pygame.org/contribute.html


In [2]:
class Neura:
    def __init__(self,game,name=None): 
        self.input(game)
        self.game=game
        self.res=19
        self.build(self.res,self.nf)
        if name=='acacio':                          # for alternative approach testing
            self.name='acacio'+game.name+str(len(game.board))+str(self.res)
        elif name is None:
            self.name=names.get_last_name()+game.name+str(len(game.board))+str(self.res)
            self.net.save_weights(f'modelos/{game.name}/{str(len(game.board))}/{self.name}.h5')
        else:
            self.name=name
            self.net.load_weights(f'modelos/{game.name}/{str(len(game.board))}/{self.name}.h5')
        

    def input(self,game):
        if (game.type==0):
            self.nf=math.ceil(len(game.board)**2*0.709+len(game.board)**2)                 
            self.inpt=layers.Input(shape=(len(game.board),len(game.board[0]),1))
            self.passes=0
        else:
            self.nf=math.ceil(len(game.board)**2*0.709+len(game.board)**2)
            self.passes=1
            self.inpt=layers.Input((len(game.board),len(game.board[0]),17))
        self.action_space=len(game.board)*len(game.board[0])+self.passes

    def convblock(self,input,nf):
        c=layers.Conv2D(nf,3,(1,1),'same',kernel_regularizer=regularizers.L2(0.0001))(input)
        b=layers.BatchNormalization()(c)
        rnl=layers.Activation(activation='softplus')(b)
        return rnl

    def resblock(self,input,i,nf):
        cb=self.convblock(input,nf)
        c=layers.Conv2D(nf,3,(1,1),'same',kernel_regularizer=regularizers.L2(0.0001))(cb)
        b=layers.BatchNormalization()(c)
        s=layers.Add()([b,input])
        rnl=layers.Activation(activation='softplus',name=f'endrestower{i}')(s)
        return rnl
    
    def polhead(self,input):
        c=layers.Conv2D(2,1,(1,1),'same',name='convpol',kernel_regularizer=regularizers.L2(0.0001))(input)
        b=layers.BatchNormalization(name='bnpol')(c)
        rnl=layers.Activation(activation='softplus',name='rnlpol')(b)
        flt=layers.Flatten(name='polflat')(rnl)
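        fc=layers.Dense(units=self.action_space,activation='softmax',name='polout',kernel_regularizer=regularizers.L2(0.0001))(flt)  # softmax so the output is a probability distribution, as the MCTS priors and CategoricalCrossentropy expect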
        return fc

    def valhead(self,input,nf):
        c=layers.Conv2D(1,1,(1,1),'same',kernel_regularizer=regularizers.L2(0.0001))(input)
        b=layers.BatchNormalization()(c)
        rnl=layers.Activation(activation='softplus')(b)
        flt=layers.Flatten()(rnl)
        fcl=layers.Dense(nf,kernel_regularizer=regularizers.L2(0.0001))(flt)
        rnl2=layers.Activation(activation='softplus')(fcl)
        fcs=layers.Dense(1,kernel_regularizer=regularizers.L2(0.0001))(rnl2)
        tanh=layers.Activation(activation='tanh',name='valout')(fcs)
        return tanh    
    
    def build(self,n_res,nf,):
        conv=self.convblock(self.inpt,nf)
        restower=conv
        for i in range(n_res):
            restower=self.resblock(restower,i,nf)
        polh=self.polhead(restower)
        valh=self.valhead(restower,nf)
        outputs=[polh,valh]
        self.net=Model(self.inpt,outputs)
        return

    def summary(self):
        self.net.summary()
        return
    
    def compilar(self,lr=0.01):
        self.net.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=lr,momentum=0.9),loss={'polout':tf.keras.losses.CategoricalCrossentropy(),'valout':tf.keras.losses.MeanSquaredError()})


    # Logistics functions for storing and replacing models
    def copy_weights(self,bestname):
        src=f'modelos/{self.game.name}/{str(len(self.game.board))}/best/{bestname}.h5'
        dest=f'modelos/{self.game.name}/{str(len(self.game.board))}/{bestname}.h5'
        copy(src, dest)
    
    def make_best(self):
        os.remove((f'modelos/{self.game.name}/{str(len(self.game.board))}/best/{self.get_best_name()}.h5'))
        self.net.save_weights(f'modelos/{self.game.name}/{str(len(self.game.board))}/best/{self.name}.h5')

    def get_best_name(self):
        folder_path = f"modelos/{self.game.name}/{len(self.game.board)}/best"
        entries = os.listdir(folder_path)
        for e in entries:
            file_name = e
            break
        return file_name[:-3]

Auxiliary Functions:

In [4]:
def get_best_name(game):
    folder_path = f"modelos/{game.name}/{len(game.board)}/best"
    entries = os.listdir(folder_path)
    file_name = None
    for e in entries:
        file_name = e
        break
    return file_name[:-3]


# Loss function, identical to the one proposed in the original paper, with the regularization parameter applied in the layers rather than explicitly in the loss
def sigmaloss(y_true,y_pred):
        pol=y_pred[0]
        pit=np.transpose(np.array(y_true[0]))
        return (y_pred[1]-y_true[1])**2-np.dot(pit,np.log(pol))
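
As a quick numerical check of this loss, the sketch below evaluates (z - v)^2 - pi . log(p) for a single example. The regularization term is left out on purpose, since in this notebook it is applied through `kernel_regularizer` in the layers. The function name `alphazero_loss` and the `eps` clipping are assumptions added to keep the example safe against log(0).

```python
import numpy as np

def alphazero_loss(pi, z, p, v, eps=1e-12):
    """(z - v)**2 - pi . log(p) for a single training example.

    pi: search policy target, z: game outcome target,
    p: predicted policy (probabilities), v: predicted value
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)   # guard against log(0)
    value_loss = (z - v) ** 2
    policy_loss = -np.dot(np.asarray(pi, dtype=float), np.log(p))
    return value_loss + policy_loss
```

A perfect prediction of a one-hot policy and of the outcome gives a loss of zero, and the loss grows as either head drifts from its target.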

In [None]:
game=makegame('G7x7')
model=Neura(game,f'best/{get_best_name(game)}')
model.compilar()

# Plotting the network requires extra dependencies; alternatively, the model.summary() method can be used

#tf.keras.utils.plot_model(model.net, show_shapes=True)

## Communications protocol of the agent

In [None]:
def choose_move_go():
    pass

In [None]:
def choose_move_attaxx():
    pass

In [None]:
def choose_move(game_name):   # returns the move in the form "MOVE X,Y"
    if game_name=='go':
        return choose_move_go()
    else:
        return choose_move_attaxx()
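
Since the two `choose_move_*` functions are still stubs, the sketch below only illustrates the last step, turning a move tuple into the text the server expects. The function name `format_move` is hypothetical; the `(-1, -1)` pass convention and the `MOVE X,Y` / `MOVE X,Y,X2,Y2` / `PASS` strings are the ones used elsewhere in this notebook.

```python
def format_move(move):
    """Turn a move tuple into the text the server expects.

    (-1, -1)        -> "PASS"
    (x, y)          -> "MOVE x,y"
    (x, y, x2, y2)  -> "MOVE x,y,x2,y2"
    """
    if tuple(move) == (-1, -1):
        return "PASS"
    return "MOVE " + ",".join(str(int(c)) for c in move)
```

Casting each coordinate with `int` keeps NumPy integers from leaking their type name into the message.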

In [None]:
def connect_to_server(host='localhost', port=12345):
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client_socket.connect((host, port))

    response = client_socket.recv(1024).decode()
    print(f"Server ResponseINIT: {response}")

    Game = response[-4:]
    print("Playing:", Game)
    if Game[0]=='A':
        game_name = 'attaxx'
    else:
        game_name = 'go'
    n = int(Game[1])

    if "1" in response:
        ag=1
    else:
        ag=2
    first=True

    game_state=avaliar.makegame(Game)
    teta=Neura(game_state,'Eakesgo7')
    alpha=MCTS(game_state,ARGS,teta)    # ARGS must provide the 'cpuct' and 'num_searches' settings used by MCTS

    while True:
        # Choose a move with MCTS and send it (agent 2 first waits for the opponent)
        if ag == 1 or not first:
            move, _ = alpha.play()      # play() returns (move, policy)
            time.sleep(1)
            if move == (-1, -1):
                smove = "PASS"
            else:
                smove = f"MOVE {move[0]},{move[1]}"
            client_socket.send(smove.encode())
            print("Send:",smove)
        
            # Wait for server response
            response = client_socket.recv(1024).decode()
            print(f"Server Response1: {response}")
            if response == "INVALID":
                continue
            if "END" in response: break
            game_state=game_state.move(move)
            
        first=False
        response = client_socket.recv(1024).decode()
        print(f"Server Response2: {response}")
        if "END" in response: break     # check before parsing coordinates
        if response == "PASS":
            game_state = game_state.pass_turn()
            alpha.cut((-1,-1))
            continue
        i=response[5]
        j=response[7]
        if game_name == "attaxx":
            i2=response[9]
            j2=response[11]
        action=(int(i),int(j))
        game_state=game_state.move(action)
        alpha.cut(action)

    client_socket.close()

In [None]:
if __name__ == "__main__":
    connect_to_server()

# Results

### Attaxx

There are no results to show, since no model was built successfully due to the incompleteness of the selfplay phase. Any indications to the contrary found elsewhere in this project are the result of miscommunication within the group. We can only report that the flexibility feature is working as expected.

### Go9x9

Because of the computational power required, no further development was carried out.

### Go7x7

The following describes an empirical analysis of the training process:

    The first trained model learned reasonably well and was subsequently replaced by the next checkpoint. From then on, no other checkpoint could beat this one. This model ('Howardgo719') learned a strategy to corner the bottom line*.
    In an attempt to surpass this stranded model, additional regularization was introduced and the weights were reset and optimized again. The resulting model beat the previous best and showed significant improvement, but did not surpass the evaluation threshold. (Unfortunately, its weights were overwritten by the new, fixed model, which hasn't learned properly yet.)
    An error in the MCTS algorithm, discovered in the final hours of the project, rendered all past optimization, data generation and evaluation meaningless. We are hopeful that the correction will make future training efficient and future models increasingly better. We haven't had time to test it yet.

    * To check this, one has to reintroduce the error in the MCTS, restoring the wrong policy formula (self.root.visit_count/child.visit_count)**(1/t) for child in root.children.
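
To see why the formula in the footnote is harmful, the sketch below compares it with weights proportional to child.visit_count**(1/t), which is what the current `play()` computes. The function name `policy_weights` and the visit counts are made up for illustration only.

```python
def policy_weights(root_visits, child_visits, t, wrong=False):
    """Unnormalised policy weights for the children of the root."""
    weights = []
    for n in child_visits:
        if n == 0:
            weights.append(0.0)
        elif wrong:
            weights.append((root_visits / n) ** (1 / t))   # inverted ratio
        else:
            weights.append(n ** (1 / t))
    return weights

children = [30, 60, 10]           # illustrative visit counts of three children
root = sum(children)
right = policy_weights(root, children, t=1)
buggy = policy_weights(root, children, t=1, wrong=True)
best_right = right.index(max(right))    # index of the most visited child
best_buggy = buggy.index(max(buggy))    # index of the least visited child
```

With the inverted ratio, the child that the search visited least receives the largest weight, so the move actually played and the training target both contradict the search.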

# References

- David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton,   
  Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis. (2017, October 19).   
    **Mastering the game of Go without human knowledge**. doi:10.1038/nature24270

- Johannes Czech, Patrick Korus & Kristian Kersting. (2020, December 22).
    **Monte-Carlo Graph Search for AlphaZero**. arXiv:2012.11045v1 [cs.AI]

- He, K., Zhang, X., Ren, S. & Sun, J. **Deep residual learning for image recognition**.
  In Proc. 29th IEEE Conf. Comput. Vis. Pattern Recognit. 770–778 (2016).