CGWire #15

Merged
merged 13 commits into from
Jun 17, 2018

tokejepsen
Collaborator

This is a PR to include CGWire into Avalon's distribution.

@tokejepsen tokejepsen requested a review from mottosso June 9, 2018 20:29
@tokejepsen
Collaborator Author

tokejepsen commented Jun 9, 2018

There are currently two issues to tackle:

  • mongod does not get executed when running the container, because start_zou.sh is blocking.
  • Can't log into Kitsu with user: admin@example and password: default.

capture

@mottosso
Contributor

mottosso commented Jun 9, 2018

Whop! There's a lot to condense here. Off the top of my head:

  • For starters we should either get CGWire out of supervisord, which it is using for running services in the background, or we get Mongo and Samba into supervisord.
  • Next there are far too many layers (RUN statements).
  • After that I'd preferably get rid of the files.. although.. I'm not too sure about that. On the upside, I'd rather have a single, long Dockerfile than lots of small files. Less to manage, better understanding of the overall complexity when it's under a single roof. On the flipside, there are quite a few of those files.. if there was some way of getting rid of the files altogether, and then including the init and run .sh scripts with the Dockerfile, that could be a win.

@tokejepsen
Collaborator Author

Mongo can be run as a daemon with --fork. It then requires logging activity to a file, which is currently /avalon/mongo.log.
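
As a rough sketch of the daemon invocation described above, the helper below assembles the mongod arguments. `--fork` and `--logpath` are existing mongod options and /avalon/mongo.log is the path mentioned above; the function name and the optional `dbpath` parameter are hypothetical, for illustration only.

```python
def mongod_fork_command(logpath="/avalon/mongo.log", dbpath=None):
    """Build the argument list for starting mongod as a background daemon.

    mongod refuses to fork without a log destination, hence --logpath.
    """
    args = ["mongod", "--fork", "--logpath", logpath]
    if dbpath is not None:
        # Optional data directory (hypothetical parameter of this helper).
        args += ["--dbpath", dbpath]
    return args


# The list can be handed to subprocess.check_call() from an entrypoint script.
print(" ".join(mongod_fork_command()))
```

Because the forked process returns immediately, a blocking script such as start_zou.sh could be started afterwards.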

@mottosso
Contributor

I whipped up a script to sync CGWire -> Avalon.

import os
import gazu
from avalon import io as avalon

# Note: global..
gazu.client.set_host("http://192.168.99.100/api")

# Note: plain-text..
gazu.log_in("admin@example.com", "default")

print("Logged in..")
projects = []
objects = []

for project in gazu.project.all_projects():
    assets = gazu.asset.all_assets_for_project(project)
    shots = gazu.shot.all_shots_for_project(project)

    # Treat CGWire assets and shots alike; each becomes an Avalon asset in its own silo
    for entities, silo in ((assets, "assets"), (shots, "shots")):
        for entity in entities:
            objects.append({
                "schema": "avalon-core:asset-2.0",
                "name": entity["name"].replace(" ", ""),  # remove spaces
                "silo": silo,
                "data": {},
                "type": "asset",
                "parent": project["name"],
            })

    projects.append({
        "schema": "avalon-core:project-2.0",
        "type": "project",
        "name": project["name"],
        "data": {},
        "parent": None,
        "config": {
            "schema": "avalon-core:config-1.0",
            "apps": [],
            "tasks": [
                {"name": task["name"]}
                for task in gazu.task.all_task_types()
            ],
            "template": {
                "work":
                    "{root}/{project}/{silo}/{asset}/work/"
                    "{task}/{app}",
                "publish":
                    "{root}/{project}/{silo}/{asset}/publish/"
                    "{subset}/v{version:0>3}/{subset}.{representation}"
            }
        }
    })

print("%d projects" % len(projects))
print("%d assets" % len(objects))

os.environ["AVALON_PROJECTS"] = r""
os.environ["AVALON_PROJECT"] = "temp"
os.environ["AVALON_ASSET"] = "bruce"
os.environ["AVALON_SILO"] = "assets"
os.environ["AVALON_CONFIG"] = "polly"
os.environ["AVALON_MONGO"] = "mongodb://192.168.99.100:27017"

existing_projects = {}
existing_assets = {}
installed_projects = []

print("Fetching Avalon data..")
avalon.install()
for project in avalon.projects():
    existing_projects[project["name"]] = project

for asset in avalon.find({"type": "asset"}):
    existing_assets[asset["name"]] = asset


print("Synchronising..")
for project in projects:
    if project["name"] in existing_projects:
        continue

    print("Installing project: %s" % project["name"])
    os.environ["AVALON_PROJECT"] = project["name"]
    avalon.uninstall()
    avalon.install()

    avalon.insert_one(project)


for asset in objects:
    if asset["name"] in existing_assets:
        continue

    asset["parent"] = avalon.locate([asset["parent"]])
    print("Installing asset: %s" % asset["name"])
    avalon.insert_one(asset)

Here's what I'm thinking.

  1. Down the line, I expect Avalon to be listening for events happening in CGWire, and synchronise as it happens.
  2. Synchronisation would happen both ways, from changes happening in Avalon -> CGWire and vice versa.
  3. Until then, a simple polling synchronisation should suffice, whereby Avalon polls CGWire for changes at a fixed interval, such as every 10 seconds. The above script is what I expect can run every 10 seconds.

This should help us prove the concept, before going much further.

Writing this, it became rather clear that our io.py is ill-devised. It's already showing legacy tendencies from the days before Launcher was made, such as expecting the environment to be fully qualified with project and all before being used. For a rainy day, we could have a look at breaking that dependency and making it into the more generic database browsing utility it was ultimately made to be.

@tokejepsen
Collaborator Author

Very cool.

Would be good to have a centralized API or similar that needs implementing for other project managers, so it's easier to know what needs doing. Would io.py be the place for this?

@mottosso
Contributor

Yes, io.py is the gateway to the database. inventory.py builds on top of that, and is effectively a higher level version of io.py, with an understanding of assets and things.

We should pop up a separate "refactoring PR" about it, but in a nutshell, one of the higher level goals of io.py was to avoid having to explicitly reference the server and project whenever accessing the database from within a host.

That is, I wanted:

from avalon import api
for asset in api.ls():
  print(asset)  # list assets from current project

As opposed to..

from avalon import api

client = api.Client("mongodb://192.168.99.100:27017")
db = client["avalon"][api.current_project()]
for asset in api.ls(db):
  print(asset)

The cost of the convenience however is more "under the hood" stuff, like having a project and db address set in the environment, upfront, which in retrospect complicates other aspects like what we're trying to do right now.
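
To make that cost concrete, here is a minimal sketch of a helper that scopes the environment change to a block, rather than mutating os.environ for the rest of the process as the sync script does. `project_environment` is a hypothetical name and not part of avalon; only the AVALON_PROJECT variable comes from the script above.

```python
import os
from contextlib import contextmanager


@contextmanager
def project_environment(project, environ=os.environ):
    """Temporarily point AVALON_PROJECT at `project`, restoring it on exit."""
    previous = environ.get("AVALON_PROJECT")
    environ["AVALON_PROJECT"] = project
    try:
        yield environ
    finally:
        # Put the environment back exactly as it was found.
        if previous is None:
            environ.pop("AVALON_PROJECT", None)
        else:
            environ["AVALON_PROJECT"] = previous


# Example with a plain dict standing in for os.environ
env = {"AVALON_PROJECT": "temp"}
with project_environment("hulk", env):
    print(env["AVALON_PROJECT"])  # inside the block
print(env["AVALON_PROJECT"])      # restored afterwards
```

In the sync script, the avalon.uninstall() and avalon.install() calls would presumably still be needed inside the block, since io.py reads the environment on install.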

@mottosso
Contributor

One more thing about the synchronisation.

Initially I was thinking that maybe it'd be worth switching Avalon to using the CGWire database entirely; skip the synchronisation step. But having interacted with it, on the surface there are a few problems with that.

  1. There are a ton of assumptions made to our disfavour, primarily the fixed structure of project, episode, sequence and shot. Fixed all the way into the individual function calls, which include the words themselves; very hard to refactor.
  2. Reading/writing (to Postgres) is surprisingly slow; I counted 40 calls/sec on getting 3 users from it. Compare this to getting 3 assets from Mongo at 45,000 requests/sec (see below). This is important, because we've been building GUIs to leverage this speed by relying on it being fast, to avoid things like caching, progress bars and timeglasses, and overall being really really fast on any queries, which means we can make a lot more complicated queries where necessary.
  3. Finally, switching to CGWire would still involve a synchronisation step with other frameworks like Shotgun and ftrack, so we aren't gaining much anyway.
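
import timeit
from avalon import io as avalon  # assumes avalon.install() was already called, as in the sync script

num = 100  # number of queries to time
# Note: find() returns a lazy pymongo cursor; wrap it in list(...) to include the time spent fetching results
dur = timeit.timeit(lambda: avalon.find({"type": "asset"}), number=num)
print("%.3f/sec" % (num / dur))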

TLDR; we should stick to Mongo internally.
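
For comparing the two backends with the same yardstick, the measurement above could be wrapped in a small helper along these lines. This is only a sketch; `calls_per_second` is a hypothetical name, and the callable passed in stands for whichever query is being timed.

```python
import timeit


def calls_per_second(func, number=100):
    """Time `number` calls of `func` and return the average calls per second."""
    duration = timeit.timeit(func, number=number)
    if duration == 0:
        # Timer resolution too coarse to measure; treat as "too fast to tell".
        return float("inf")
    return number / duration


# Stand-in workload; swap in the Mongo or CGWire query to be measured.
rate = calls_per_second(lambda: sum(range(1000)))
print("%.3f/sec" % rate)
```

The query should consume its results inside the callable, otherwise only the cost of issuing it is measured.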

@tokejepsen
Collaborator Author

Currently can't get a working version of this, not even to the point of getting the Kitsu website up. Running into this issue:

Step 30/34 : RUN echo Initialising Zou... &&     /opt/zou/init_zou.sh
 ---> Running in f9452a43292f
/bin/sh: 1: /opt/zou/init_zou.sh: not found

But the file /opt/zou/init_zou.sh clearly is present and copied just a couple of lines earlier.

@tokejepsen
Collaborator Author

Interestingly I had to switch the line-endings on init_zou.sh to Unix style from Windows style endings.

I assume this is because I'm developing on Windows, and when cloning the repository, Git assumes Windows-style line endings.
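
A likely explanation for the "not found" error is that, with CRLF endings, the shebang line ends in a carriage return, so the kernel looks for an interpreter whose name includes that stray character. A minimal sketch of the conversion follows; the function names are hypothetical.

```python
def to_unix_line_endings(data):
    """Convert Windows (CRLF) line endings in `data` (bytes) to Unix (LF)."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite `path` with LF endings; return True if the file was changed."""
    with open(path, "rb") as f:
        original = f.read()
    converted = to_unix_line_endings(original)
    if converted != original:
        with open(path, "wb") as f:
            f.write(converted)
    return converted != original


print(to_unix_line_endings(b"#!/bin/bash\r\necho Initialising\r\n"))
```

An alternative that avoids converting by hand is a .gitattributes entry such as `*.sh text eol=lf`, which asks Git to check out shell scripts with LF endings on every platform.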

@tokejepsen
Collaborator Author

First working dockerfile. Woop woop!

capture

Apart from the line endings problem, there is also an issue when creating the default admin user with zou. The email is being checked for validity here, so it can't be the example admin@example.com.
Don't know how CGWire did this, with that email.

@tokejepsen
Collaborator Author

I have managed to remove all but two external files. I didn't remove nginx.conf and supervisord.conf because they are quite long, and I could not figure out a way of creating the files in the Dockerfile.

I'm also not entirely happy about the syntax for creating the files, because it's a lot of RUN echo commands. It can probably be improved for readability.

Lastly, I have temporarily disabled the creation of the admin@example.com user because of this.

@tokejepsen
Collaborator Author

For starters we should either get CGWire out of supervisord, which it is using for running services in the background, or we get Mongo and Samba into supervisord.

It'll probably be very easy to get Mongo and Samba into supervisord. Do you mean we'll use supervisor to reduce the entrypoint to supervisord -c /etc/supervisord.conf ?

@mottosso
Contributor

It'll probably be very easy to get Mongo and Samba into supervisord. Do you mean we'll use supervisor to reduce the entrypoint to supervisord -c /etc/supervisord.conf ?

Yeah, seems reasonable I think.
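
For illustration, a sketch of what the extra supervisord sections might look like, rendered with Python's configparser. The `[supervisord]` and `[program:x]` sections and the `nodaemon`, `command` and `autorestart` keys are standard supervisord settings; the exact commands and the helper name are assumptions, not what this PR ships. Note that mongod is run without --fork here, since supervisord expects its programs to stay in the foreground.

```python
import configparser
import io


def render_supervisord_conf(programs):
    """Render a minimal supervisord.conf from a {name: command} mapping."""
    config = configparser.ConfigParser(interpolation=None)
    # Keep supervisord itself in the foreground so the container stays alive.
    config["supervisord"] = {"nodaemon": "true"}
    for name, command in programs.items():
        config["program:%s" % name] = {
            "command": command,
            "autorestart": "true",
        }
    buf = io.StringIO()
    config.write(buf, space_around_delimiters=False)
    return buf.getvalue()


conf = render_supervisord_conf({
    # Assumed commands: both must run in the foreground under supervisord.
    "mongod": "mongod --logpath /avalon/mongo.log",
    "smbd": "smbd --foreground --no-process-group",
})
print(conf)
```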

@tokejepsen
Collaborator Author

That is Mongo and Samba running in supervisor. supervisor is actually quite a neat framework.

Dunno what to do about the existing external files and the amount of RUN commands.

Should this PR be with the CGWire sync script as well?
Imagining this to be run through supervisor.

@mottosso
Contributor

Should this PR be with the CGWire sync script as well?

I think we can make that an independent PR. Think we have a few things to work out that don't involve getting CGWire up and running.

Imagining this to be run through supervisor.

Yeah, I think a plain Python process running on an infinite while loop should suffice, running some function and sleeping for 10 seconds.

Then we can open up a dialog with Frank about what we need from CGWire in terms of callbacks to get rid of it.
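
The polling loop described above could be sketched roughly as follows. `poll` and its parameters are hypothetical; `sync` stands for whatever function wraps the CGWire -> Avalon script, and `max_iterations` exists only so the loop can be bounded in tests.

```python
import time


def poll(sync, interval=10, max_iterations=None, sleep=time.sleep):
    """Call `sync` repeatedly, sleeping `interval` seconds between calls.

    Runs forever when `max_iterations` is None. Returns the number of
    iterations performed otherwise.
    """
    count = 0
    while max_iterations is None or count < max_iterations:
        try:
            sync()
        except Exception as exc:
            # Keep the loop alive if one synchronisation fails.
            print("Sync failed, retrying in %ds: %s" % (interval, exc))
        count += 1
        sleep(interval)
    return count


# Bounded demonstration with a no-op sleep
poll(lambda: print("Synchronising.."), max_iterations=2, sleep=lambda s: None)
```

Catching exceptions inside the loop is a design choice for a long-running process; under supervisor, letting it crash and restart would also be workable.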

@mottosso
Contributor

Dunno what to do about the existing external files and the amount of RUN commands.

I'll do a pass over it now, see what I can do.

@tokejepsen
Collaborator Author

This is looking good to me. Think we still need to figure out the email issue before we can merge.

@mottosso
Contributor

mottosso commented Jun 17, 2018

Ok, works!

$ docker run \
  --name avalon \
  -e AVALON_USERNAME=avalon \
  -e AVALON_EMAIL=avalon@getavalon.github.io \
  -e AVALON_PASSWORD=default \
  -v avalon-db:/data/db \
  --rm -ti \
  -p 445:445 \
  -p 27017:27017 \
  -p 80:80 avalon/docker:0.4

Where the -e flags are optional.

@tokejepsen
Collaborator Author

Nicely done! I'm pretty happy with this. Merge?

@mottosso mottosso merged commit 0d7e43f into master Jun 17, 2018
@mottosso
Contributor

Done! Would you like to give the multi-container approach a try next?

@tokejepsen
Collaborator Author

Sure, let's

@mottosso
Contributor

Oh no, I just realised something. The email is currently hardcoded into the image, as it's being installed on build, not on run.

Something to keep in mind for the multi approach.

@tokejepsen
Collaborator Author

That is a good point. Maybe we should have it as part of the entry points, that it'll make an admin account if an email is passed in?

Even more of a reason for splitting the tracking container from the rest. Going to be very specific behaviour for Ftrack and friends.
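
A minimal sketch of the entry-point idea above, assuming the account is created when the container starts rather than when the image is built. AVALON_EMAIL and AVALON_PASSWORD are the variables from the docker run example; `maybe_create_admin` and the `create_admin` callback are hypothetical, the latter standing in for whatever zou provides to create the user.

```python
import os


def maybe_create_admin(create_admin, environ=os.environ):
    """Create an admin account at start-up, but only if an email was passed in.

    Returns True when `create_admin(email, password)` was called.
    """
    email = environ.get("AVALON_EMAIL")
    if not email:
        # No email given: leave the tracker without a custom admin.
        return False
    password = environ.get("AVALON_PASSWORD", "default")
    create_admin(email, password)
    return True


# Example with a stand-in callback and a dict in place of os.environ
maybe_create_admin(
    lambda email, password: print("Creating admin: %s" % email),
    {"AVALON_EMAIL": "avalon@getavalon.github.io"},
)
```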

@tokejepsen
Collaborator Author

The email is currently hardcoded into the image, as it's being installed on build, not on run.

Think the approach will be that admin@example.com will be the default admin user, and from that account users will create other admin accounts with more secure login, then delete the admin@example.com account.
