TV Archives cracked open - "AI for IA" - Artificial Intelligence for Internet Archive #661

Open
mozfest-bot opened this Issue Aug 1, 2017 · 12 comments

Comments

@mozfest-bot
Collaborator

mozfest-bot commented Aug 1, 2017

[ UUID ] b8c004f2-d61b-4fd1-9f0b-68abdeb279aa

[ Session Name ] TV Archives cracked open - "AI for IA" - Artificial Intelligence for Internet Archive
[ Primary Space ] Decentralization
[ Secondary Space ] Privacy and Security

[ Submitter's Name ] Tracey Jaquith
[ Submitter's Affiliated Organisation ] Internet Archive
[ Submitter's Github ] @traceypooh

[ Additional facilitators ] Dan Schultz

What will happen in your session?

archive.org/tv has been recording 60+ worldwide channels, 24x7, since 2000.
We have 2M+ news shows, including new Trump and Congress subsets, as citable reference, for captions/metadata search, AI experiments, Popcorn editing, and more.
Absolute browser Privacy - no personal data or IP addresses extracted.
For nontampering validation, we keep original versions with checksums and logs.
Recent AI work: OCR scanning of chyrons ("lower thirds"); face detection of public officials; caption correction & alignment; Twitter and Slack bots that post the results.
We can store and search arbitrary metadata (eg: ..) as a JSON "clip" with start/end time ranges, and can track play counts and referrers/embedders.
HELP US shape our API to play with TV, tag clips with metadata or pointers to Decentralized metadata, AI ideas, and more!
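
To make the JSON "clip" idea concrete, here is a minimal sketch of a clip record with a start/end time range and a search by time overlap. The class name `Clip`, its field names, the `find_clips` helper, and the show identifier are all hypothetical, chosen for illustration only; they are not the archive.org schema or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Clip:
    """Hypothetical clip record; field names are illustrative, not the archive.org schema."""
    show_id: str   # identifier of the recorded show (made-up value below)
    start: float   # clip start, in seconds from the beginning of the show
    end: float     # clip end, in seconds from the beginning of the show
    metadata: dict = field(default_factory=dict)  # arbitrary tags or pointers

    def overlaps(self, start: float, end: float) -> bool:
        """True if this clip shares any time with the half-open range [start, end)."""
        return self.start < end and start < self.end


def find_clips(clips, start, end):
    """Return the clips that overlap the requested time range."""
    return [c for c in clips if c.overlaps(start, end)]


clips = [
    Clip("EXAMPLE_SHOW_1", 0.0, 30.0, {"tag": "intro"}),
    Clip("EXAMPLE_SHOW_1", 45.0, 90.0, {"tag": "interview"}),
]

# Both example clips touch the 20s-50s window, so both are returned.
hits = find_clips(clips, 20.0, 50.0)
print(json.dumps([asdict(c) for c in hits], indent=2))
```

Keeping `metadata` as a free-form dict is one way to allow "arbitrary" tags or pointers to metadata stored elsewhere without changing the record shape.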

What is the goal or outcome of your session?

To work with journalists, researchers, hackers, audio/video and captions and artificial intelligence folks, to learn more about what's available at archive.org/tv and help us extend it with API ideas and experiments. We can show which feeds (examples: continuous captions feed from CSPAN; latest headlines shown on CNN, MSNBC, FOXNEWS, BBCNEWS TV shows) can be used now and which should be live soon.
We want to aid journalists and others with various ways to get summaries of what is going on in TV, since it is such a big, nonstop 'firehose' of content.
A secondary goal is to find ideas or requests we don't anticipate!
We anticipate making available many continuous feeds of AI results for all to use.

Time needed

60 mins

@cubicgarden

Collaborator

cubicgarden commented Aug 4, 2017

I like it but unsure if it fits the decentralisation space well?

@erikao erikao self-assigned this Aug 4, 2017

@traceypooh

traceypooh commented Aug 7, 2017

I don't disagree!
It seemed it had some:

  • (less commonly in the press) decentralized aspects to it (decentralized metadata and AI)
  • privacy aspects to it (AI and metadata and feeds available without snarfing your metadata/privacy!)
  • even a little of "Open Innovation" and "Digital Inclusion"

... so I tried the top two :) 100% happy to shuffle to wherever it makes more/most sense if it's accepted! 🎉

@erikao erikao added the Open News label Aug 9, 2017

@cubicgarden

Collaborator

cubicgarden commented Aug 10, 2017

Well the best proposals have a bit of everything :)

@traceypooh

traceypooh commented Sep 20, 2017

so... was this... accepted? got a slightly confusing email to me overnight :)
if so, is it for a 30m-60m talk slot (have done that in last 5-6y)
or is more like a science fair exhibit table (have done those too :)
thanks and sorry i'm slightly confused!

@cubicgarden

Collaborator

cubicgarden commented Sep 20, 2017

It was accepted into the decentralised space as a 60min talk. Email went out today, feel free to email me back. Thanks Tracey

@traceypooh

traceypooh commented Sep 20, 2017

woohoo! awesome, thanks!

@kaodro

kaodro commented Sep 28, 2017

Hi @traceypooh , my name is Kasia and I work on Mozilla’s Internet Health Report. At Mozfest we will be present with an “Internet Research Hub” #618 (last comment) - an easy-going, cozy space for discussions and networking both with us about Internet Health and among researchers themselves.

We also invite anyone who does work in an Internet research field to sign up and present their work at a couple of open display tables we will have in the hub. If you would like to present something in the hub in addition to your official session, let me know! We will promote these sessions throughout the festival. You can sign up spontaneously with pen and paper on site or if you would like to save a spot beforehand, drop me an email at kasia@mozillafoundation.org with a short description of the session.

In any case, I would like to invite you to pass by the hub and say hi. We will start with an informal “Research and coffee grinder” get-together at the beginning of the festival where people can get to know each other. Space and exact schedule for the Hub are still being decided and I will update you once we know the details. Hope to see you there!

@dvigneshwer

Collaborator

dvigneshwer commented Oct 1, 2017

Hi @traceypooh ,

As MozFest is approaching we require the following information from your end to better support your session in the Decentralization learning forum space. You can get back to us by replying to this issue or emailing us directly, whichever communication channel is convenient to you.

  1. Please confirm that you will be able to attend Mozfest 17
  2. Provide us a brief outline of your session topics and time estimates
  3. Does your session require any additional materials or electronic equipment other than a projector and general office stationery?
  4. Please let us know if you want to make any modification to your session proposal

Thank you! Please don't hesitate to contact us if you have any queries.

@traceypooh

traceypooh commented Oct 23, 2017

@kaodro hi Kasia -- sure I'd be happy to participate. I focus on TV and video -- though I've been focusing a lot on artificial intelligence and machine learning recently (especially image matching and face detection/tracking).

@traceypooh

traceypooh commented Oct 23, 2017

@vigneshwerd whups, sorry I've been at many conferences and vacation so been trying to sort out loose ends...

  1. Yes, confirmed will be there and can't wait. Arriving Wed...
  2. Rough outline is:
  • overview/introduction to archive.org and the TV Archive
  • walkthrough of file checksums, mod times, and external blockchain-based integrity proofs
  • discussion on AI in use in TV Archive
    • audio fingerprinting (Duplitron)
    • OCR of chyron "lower thirds" (TV Third Eye)
    • face detection and tracking of public figures (Faceomatic)
  • TV clip-based JSON format metadata
  • "There Goes 2 Weeks -- deep dive into image matching and facial recognition"
    • pixel diff algorithms
    • perceptual hashing
    • Siamese "one shot" CNN recognizers
    • tensorflow and GoogLeNet Inception face training sets
    • OpenFace implementation of FaceNet
  • Ethics open discussion of AI
  3. just a projector or large-screen TV would be great. I'll have laptop and HDMI cable...
  4. not really -- though I will probably add in a bunch more detail on a section related to image matching and facial detection. I realize it's probably too late to change the online guide slightly, but that's probably OK, too.
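
As a rough illustration of two techniques named in the outline above (pixel diff and perceptual hashing), here is a minimal pure-Python sketch. It is not the TV Archive's implementation. It assumes frames have already been reduced to tiny grayscale grids (real average-hash pipelines usually downscale to something like 8x8 first), and the function names and sample frames are made up for the example.

```python
def pixel_diff(a, b):
    """Mean absolute difference between two same-sized grayscale frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def average_hash(frame):
    """Average hash: one bit per pixel, set when the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(h1, h2):
    """Number of bit positions where two equal-length hashes differ."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))


# Made-up 4x4 grayscale frame (checkerboard), plus two variants of it.
frame = [[10, 200, 10, 200],
         [200, 10, 200, 10],
         [10, 200, 10, 200],
         [200, 10, 200, 10]]
brighter = [[p + 40 for p in row] for row in frame]   # same picture, uniformly brighter
inverted = [[255 - p for p in row] for row in frame]  # a genuinely different picture

print(pixel_diff(frame, brighter))                            # pixel diff reacts to the brightness shift
print(hamming(average_hash(frame), average_hash(brighter)))   # the hash does not
print(hamming(average_hash(frame), average_hash(inverted)))   # every hash bit flips
```

The point of the comparison: a raw pixel diff reports a change for a uniform brightness shift, while the average hash stays identical because each pixel keeps its position relative to the frame mean.
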
@kaodro

kaodro commented Oct 23, 2017

@traceypooh great! Just pass by our Hub (first floor, Fri-Sat) and you can sign up for a session spontaneously, or shoot me an email at kasia@mozillafoundation.org if you would like to book a spot in advance. We will have tables and screens; you just need your own laptop. If you want to present something, be prepared for very weak Internet. In any case, please come to our Research & Coffee grinder get-together on Saturday at 9:45, right after the opening speeches (https://guidebook.com/guide/114124/event/16880569/). Looking forward to meeting you!

@traceypooh

traceypooh commented Oct 30, 2017
