## Multi-Host Data Sharding

For the TPU setup, follow the [TPU Starter](https://github.com/ayaka14732/tpu-starter) guide.

<details>
<summary>
(Click to expand) macOS iTerm users can open an SSH pane to every host, which is handy for diagnosing all hosts at once, with the following AppleScript:
</summary>

```applescript
tell application "iTerm"
	activate
	create window with default profile
	
	tell current session of current window
		write text "gcloud compute tpus tpu-vm ssh tpu-name --worker=0"
	end tell
	
	set sessionList to {current session of current window}
	
	repeat with i from 1 to 7
		tell current session of current window
			set newSession to (split horizontally with default profile)
		end tell
		delay 0.5
		set end of sessionList to newSession
	end repeat
	
	repeat with i from 1 to 7
		tell item (i + 1) of sessionList
			write text "gcloud compute tpus tpu-vm ssh tpu-name --worker=" & i
		end tell
	end repeat
end tell
```
</details>

Then press `cmd+shift+I` to broadcast your keyboard input, so that each command you type is sent to every pane.
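For readers who are not on iTerm, the per-worker SSH commands from the AppleScript can also be generated programmatically. The following is a minimal sketch that only builds and prints the commands without running them. `build_ssh_commands` is a hypothetical helper, `tpu-name` is the placeholder used above, and the worker count of 8 is taken from the script's loop bounds. It assumes the default `gcloud` project and zone are already configured.

```python
import shlex


def build_ssh_commands(tpu_name: str, num_workers: int) -> list[list[str]]:
    """Return one gcloud SSH argument list per worker (nothing is executed)."""
    return [
        ["gcloud", "compute", "tpus", "tpu-vm", "ssh", tpu_name, f"--worker={i}"]
        for i in range(num_workers)
    ]


# "tpu-name" is a placeholder; 8 workers matches the AppleScript above.
commands = build_ssh_commands("tpu-name", 8)
for argv in commands:
    print(shlex.join(argv))
```

Each printed line can be pasted into its own terminal tab or pane.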

In [None]:
!IPYTHONDIR=. ipcontroller --ip='*'  # Start this on the controller host
!podrun -iw -- /home/evergreen/.local/bin/ipengine --file=profile_default/security/ipcontroller-engine.json  # Start this on each engine host

In [1]:
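import os

# Use the same IPYTHONDIR as the controller so the client finds the
# connection files under ./profile_default/security
os.environ['IPYTHONDIR'] = "."

import ipyparallel as ipp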

# Connect to the IPython cluster
rc = ipp.Client()
rc.ids  # List the engine IDs

[0, 1, 2, 3, 4, 5, 6, 7]

In [2]:
# Create a DirectView of all engines
dview = rc[:]

# Run a Python command on all engines
dview.execute('import socket')
dview.execute('hostname = socket.gethostname()')
dview.execute('print(f"Hello from {hostname}")')

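# Collect the hostname from every engine. The lambda runs remotely, so
# `socket` resolves to the module imported on each engine above.
results = dview.apply_sync(lambda: socket.gethostname())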
print(results)

['t1v-n-30d29099-w-0', 't1v-n-30d29099-w-5', 't1v-n-30d29099-w-6', 't1v-n-30d29099-w-3', 't1v-n-30d29099-w-1', 't1v-n-30d29099-w-7', 't1v-n-30d29099-w-4', 't1v-n-30d29099-w-2']
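Note that the list above is ordered by engine ID, and engine IDs do not line up with worker indices (engine 1 runs on worker 5, for example). When each host has to load its own slice of the data, one option is to derive the worker index from the hostname. The sketch below assumes hostnames end in `-w-<index>`, as they do in the output above. `worker_index` and `shard_for_worker` are hypothetical helpers, the file names are made up, and the strided split is only one simple sharding scheme, not necessarily the one used elsewhere in this document.

```python
import re

# Hostnames as returned by the engines above, indexed by engine ID.
hostnames = [
    't1v-n-30d29099-w-0', 't1v-n-30d29099-w-5', 't1v-n-30d29099-w-6',
    't1v-n-30d29099-w-3', 't1v-n-30d29099-w-1', 't1v-n-30d29099-w-7',
    't1v-n-30d29099-w-4', 't1v-n-30d29099-w-2',
]


def worker_index(hostname: str) -> int:
    """Parse the trailing worker index from a hostname like '...-w-5'."""
    match = re.search(r"-w-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"unexpected hostname format: {hostname!r}")
    return int(match.group(1))


# Map each engine ID to the worker it runs on.
engine_to_worker = {eid: worker_index(h) for eid, h in enumerate(hostnames)}


def shard_for_worker(items, worker: int, num_workers: int) -> list:
    """Strided split: worker k takes items k, k + n, k + 2n, ..."""
    return list(items[worker::num_workers])


# Hypothetical file names, for illustration only.
files = [f"shard-{i:02d}" for i in range(16)]
for engine_id, worker in sorted(engine_to_worker.items()):
    print(engine_id, worker, shard_for_worker(files, worker, len(hostnames)))
```

The strided split gives every worker a disjoint subset, and together the subsets cover every item, so no file is read twice or skipped.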
