diff --git a/src/assets/blog/9-16.jpeg b/src/assets/blog/9-16.jpeg new file mode 100644 index 0000000..6e16cdb Binary files /dev/null and b/src/assets/blog/9-16.jpeg differ diff --git a/src/assets/blog/9-17.jpg b/src/assets/blog/9-17.jpg new file mode 100644 index 0000000..eb2940e Binary files /dev/null and b/src/assets/blog/9-17.jpg differ diff --git a/src/assets/blog/9-23.jpg b/src/assets/blog/9-23.jpg new file mode 100644 index 0000000..96eec49 Binary files /dev/null and b/src/assets/blog/9-23.jpg differ diff --git a/src/assets/blog/9-29.jpg b/src/assets/blog/9-29.jpg new file mode 100644 index 0000000..2246e00 Binary files /dev/null and b/src/assets/blog/9-29.jpg differ diff --git a/src/assets/blog/9-6.jpg b/src/assets/blog/9-6.jpg new file mode 100644 index 0000000..601309e Binary files /dev/null and b/src/assets/blog/9-6.jpg differ diff --git a/src/pages/blog.astro b/src/pages/blog.astro index 42b65a1..e501f57 100644 --- a/src/pages/blog.astro +++ b/src/pages/blog.astro @@ -6,6 +6,11 @@ import "~/styles/index.css"; import ContentSection from "~/components/content-section.astro"; import Spacer from "~/components/Spacer.astro"; +import thumbnail96 from "~/assets/blog/9-6.jpg"; +import thumbnail916 from "~/assets/blog/9-16.jpeg"; +import thumbnail917 from "~/assets/blog/9-17.jpg"; +import thumbnail923 from "~/assets/blog/9-23.jpg"; +import thumbnail929 from "~/assets/blog/9-29.jpg"; const { site } = Astro; const description = "ML@Purdue AIGuide Blog"; @@ -48,1081 +53,40 @@ const description = "ML@Purdue AIGuide Blog"; - -

Computer Vision with Aref Malek

-

September 17, 2023

- -

Transcript

-
-

Aref is a Purdue CS major/math minor senior and the VP of the ML@Purdue club. He is interested in computer vision and has pursued it through personal projects and internships.

-

 

-

Note: You don't need to read the entire transcript or watch the entire video. If you see an interesting question, you can just jump to it. Additionally, there are some points in the video where there were background noises, so you may notice "jumps".

-

 

-

Resources by Aref

- -

 

-

Brian:

Hi, my name is Brian, and I'm interviewing Aref, the vice president of the ML@Purdue Club.

-

 

-

Aref:

 Hey, good to meet you. Sup, Brian?

-

 

-

Brian:

Hi, can you explain your background?

-

 

-

*** Weird noise so had to cut out

-

 

-

Aref:

Background as far as stuff goes: I've really liked computer vision for a while. Jacob and I were both into that when we started talking and becoming friends. I think my first foray into it was with a club called AMP, Autonomous Motorsports Purdue. They had a Data Mine course for it where one of the graduate students on the team said, I'll mentor a bunch of younger kids and let them do their little research or try little projects and figure it out. So I did that for a while and I really liked it. So much so that I thought I was going to be a vision person for sure. After that I worked professionally at NASA, where I didn't do pure computer vision; I worked with a bunch of Google Cloud AI tools, basically trying to put together a project that they could use internally. In the fall, in the second semester of my sophomore year, I worked on a project called VEX, like VEX Robotics. Right now, Nick, I believe, is the lead for that. Basically, it's a robotics competition that's done around the country every year where you pretty much try to pick up rings or some sort of objective. What's special about it, I guess the semester that we did it, is that we were using a purely ML approach, meaning there was no deterministic algorithm like "if I see the ring, then I'll go forward and pick it up." We were trying to take a completely deep learning approach. So I did that throughout the spring of my sophomore year and learned a lot more about computer vision. There was a little bit of reinforcement learning involved as well. I didn't touch that side as much, but that's just some stuff we played around with. Heading into that summer (I have a very diverse background, but I'll just keep going), I worked as a software intern at Amazon. I worked on a computer vision product, but I was purely a software dev. I was working on the back end for that machine; I wrote a little bit of systems code, you could say. And then after that, I had a little bit more research experience. I worked with Professor Bera in the IDEAS Lab, under a visiting student at the time, on a project that was speech-to-facial-motion synthesis. I'll explain what that means later. But I worked on that for a couple of months before I spent this past summer at NASA again in Virginia, where I worked on a project for wildfires. Now I work as a full stack developer at Tesla for the Supercharging network. So very long, but we'll piece that apart as time goes along.

-

 

-

Brian:

Yeah, so I had some questions about, you know, more in-depth details of your background. I see that you participated in the Autonomous Motorsports club. Can you explain what that club does, and, you know, your specific project?

-

 

-

Aref:

Yeah, I mentioned a little bit at the beginning, but let me start from the top. So AMP stands for Autonomous Motorsports Purdue. It's a club that works on a race car that's used at the Indy 500, and what that means is not the actual, nationally televised Indy 500: there's a thing called the Indy Autonomous Challenge. What was special about that project is that cars would pretty much race around that 500 track with obstacles and cones and things like that, but it was entirely AI based. People used different approaches, right? Some people would hard-code it; some people would try to use deep learning. But what was special about it at its core was that nobody was actually touching the car when it moved. All the maneuvers that it did, all the racing that it did, was completely by itself. AMP, as far as I was concerned, was a VIP project from Purdue's Data Mine, which I believe many first and second years take part in. At the time, what I worked on there was that they were trying to figure out: can we use a network, like just a one-and-done network, to pretty much predict where the car is going and what it's going to do? So what we worked on was, given some environment that we make, let's say we have a model of what the actual racetrack looks like in Unity or some sort of VR environment, can we train the network in this little VR environment so that it actually successfully routes around? I worked a lot on the vision aspect of that. I learned a lot because I started at zero, right? So it was a really rough start, and then we got a little bit better over time. That's the general overview. I worked on that as a VIP student for a while, and then for a couple of weeks, not too long, I was just talking with and trying to help out the actual graduate team that works there. The guy that I did that with originally is now the team lead of the entire Purdue team, and he's a senior. So he's doing very well for himself.

-

 

-

Brian:

Okay, so I saw this YouTube video about race cars. And - so I have very limited knowledge on it, but I know that for race cars around turns, you want to go slow, - right? And there's like an optimal sort of planning. So did you incorporate that somehow into your - predictive algorithm?

-

 

-

Aref:

Actually, yeah, that's a big thing that you struggle with when you're collecting data for these networks, especially from a vision point of view. Say you just take the naive approach of: okay, here's a car, I'm at the wheel as a person driving, and I'm just going to predict how much I turn the wheel and how much I press the gas pedal. I mean, you're certainly not going to slow down a bunch when you're driving, right? You slow down just enough that you don't fly off the edge, but you're trying to go fast at the end of the day. So anyways, if you took the naive approach of just predicting those two things while driving, what happens is that the car in general is just going straight and then turning at specific points, right? The roundabout, or I guess I'd call it the turn. So when you actually train the network, if it averages out where it's turning, it's kind of like slightly going to the left, right? Not the sharp turn that's usually made as you round the corner; it averages it out to something in the middle. And that sucks, because when you actually drive, you're in a situation where: I'm going to go straight, I'm going to go closer and closer to the edge of the curb, and then I'm going to completely fly off, right? So anyways, when we actually built the data that it was going to train on, we had to make sure the car was able to swerve off the lane. We built in situations where the car would drive, fly off the lane, and we punished it for that, obviously. And we would also start to reward the car for following a series of points. You could imagine when you're driving on the road, you pretty much predict: either I'm going to stay straight on the highway, or this lane got backed up or it's closing off, so I'm going to change my path to merge onto a different lane, something like that, right? Pretty much we had to think about where the car is going to be, rather than how we are actually going to move the car at this instant.

-

 

-

Brian:

Okay, so when you set up your training data, did you already pre-compute the optimal path? And then that is what the virtual car is going to follow?

-

 

-

Aref:

Yeah, long story short, we would have a set of different driving styles: one where it would weave back and forth, one where it would take wide turns, narrow turns; we would vary the whole mix, right? And for any point in time that the car was at, the training data would look maybe 10 seconds ahead, i.e., where was the car at every single second over the next 10 seconds? So let's say I was making that turn you mentioned, right? That means that at second one I'm here; at second two I'm a little bit further, actually closer to the grass, meaning I'm tighter on the turn, which lowers my centripetal force, right? And so on along the turn, so that when the car goes wide or something like that in the actual training, we would know to punish the car because it didn't follow the path that we set out. Yeah, so we gave it paths to train on, pretty much.
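Note: below is a minimal sketch of the waypoint-style supervision Aref describes, predicting where the car should be at each of the next 10 seconds instead of instantaneous controls. It is an illustration under assumed names and shapes, not AMP's actual code.

```python
# Sketch (assumed shapes/names): predict future (x, y) waypoints from a
# camera frame and penalize deviation from the recorded reference path.
import torch
import torch.nn as nn

class WaypointNet(nn.Module):
    def __init__(self, horizon: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(           # toy vision backbone
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, horizon * 2)   # (x, y) per future second
        self.horizon = horizon

    def forward(self, frames):                   # frames: (B, 3, H, W)
        return self.head(self.backbone(frames)).view(-1, self.horizon, 2)

model = WaypointNet()
frames = torch.randn(8, 3, 120, 160)             # dummy camera batch
target_paths = torch.randn(8, 10, 2)             # waypoints from the recorded
                                                 # driving styles
loss = nn.functional.mse_loss(model(frames), target_paths)
loss.backward()                                  # "punishes" leaving the path
```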

-

 

-

Brian:

So is there, like, an already existing sort of optimal-path algorithm out there?

-

 

-

Aref:

There actually is. We were trying to keep up with one paper from NVIDIA, I think called PilotNet. What they did is pretty much, at any given moment, the network says the car's here, it has like four different paths open to it, and it's basically just weighting the values of each one. So one network pretty much did the job of "here's your eight paths" or something, the second network said "here's the optimality of each," and then the car would choose where to go from there.
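Note: a rough sketch of the two-network setup Aref attributes to the PilotNet-style approach, one network proposing candidate paths and another scoring them. The actual NVIDIA paper differs in detail, so treat all names and sizes here as assumptions.

```python
# Sketch (illustrative, not the actual PilotNet design): one network proposes
# k candidate paths, a second scores each, and the car follows the best one.
import torch
import torch.nn as nn

K, HORIZON = 8, 10
proposer = nn.Sequential(nn.Linear(512, 256), nn.ReLU(),
                         nn.Linear(256, K * HORIZON * 2))
scorer = nn.Sequential(nn.Linear(HORIZON * 2, 64), nn.ReLU(),
                       nn.Linear(64, 1))

features = torch.randn(1, 512)                  # image features from a backbone
paths = proposer(features).view(K, HORIZON, 2)  # "here's your eight paths"
scores = scorer(paths.flatten(1)).squeeze(-1)   # "the optimality of each"
best_path = paths[scores.argmax()]              # the path the car follows
```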

-

 

-

Brian:

Okay, so I wanted to move on to your NASA intern experience. I saw that you worked on wildfire localization. Can you go into more detail about the specific algorithms you used, what challenges there were with the problem, etc.?

-

 

-

Aref:

Yeah, I won't talk too much about the algorithms. I'm supposed to be working on it still.

-

That's a whole different conversation. But anyways, I'll give you the overview of the problem. There's an existing paper already that pretty much says: NASA had access to these drones, basically from the army, called Predator drones; they're used for very questionable things and stuff like that.

-

 

-

*** Weird noise so had to cut out

-

 

-

The reason why it's important is that this is a big drone that can fly super high in the sky, and it's huge. This thing is maybe bigger than a school bus in length and super wide. What's special about that is that you can put a whole bunch of sensing gear on it. From the mid-2000s up until the mid-2010s, when there were fires in California, NASA was allowed to put on this thing called the AMS sensor, the Autonomous Modular Sensor. When you put it on, it would have access not only to a general camera, like your iPhone-type camera; you'd have access to infrared readings, thermal readings, a bunch of different sensors. And so what's special about that is that you basically have this really good problem set of: here's all the fires in California at different times of day and in different types of weather. So regardless of what the weather is like, regardless of what the lighting is like, we can tell what fire looks like, right? Because we can tell that, okay, in infrared, even if it's cloudy or something, we can still see the heat make it to the sensor. No one had ever taken a deep learning approach to that. And so the paper that was done by interns about two years ago was the first to say: hey, here's the suite of stuff, let's make a bunch of networks that are super lightweight, that can fly on a less solidified drone, but still get the job done. So what that means is that even if we don't get access to a super high, fancy drone,

-

 

-

*** Weird noise so had to cut out

-

 

-

we could still take these networks, benchmark them to find whichever one's the best, and then fly it on the drone and collect the data. The reason that's important is that right now, the way the forest system does it is that whenever there's a fire, they fly some fire-spotting drones over it, they label the data, and then they're ready to say "here's where the fire is" in like 12 hours, which is problematic, right? It's a little bit of a slow response. So our dream with this thing is to have a lighter drone, right? It might not have the full sensor suite, but whatever it has, we load our network onto it. It flies over wherever the fire is, it does our little detection, and then at least you're 80% of the way to the solution, right? So even though you'll have to have someone in the loop to label it, you won't have a 12-hour turnover loop. That's something that we really want to improve.

-

 

-

Brian:

You mentioned that this drone uses a hyperspectral sensor or something, many different types of sensors, like infrared and, I guess, RGB too. My question is: is it possible to maybe just hard-code this? Like, I'm not really sure, but something like, if the thermal reading is above X, then there's a fire. What would be the advantage of using deep learning?

-

 

-

Aref:

So the reason why is because it's really hard to capture. It's actually a very good thing that you brought that up. A lot of the existing papers pretty much do that. There's another thing called the Landsat satellite. It's this big satellite that NASA sends into space, and it pretty much has almost the same suite of sensors. And you have to think about how expensive that is to fly in space. Basically, people come up with hard-coded algorithms, like you said, where it's like: if the ratio between the thermal at this area and the IR at this area is above 0.72, then that's one of the factors that may lead to fire. The problem is that you'll never get it perfect, right? Because if there was a way to get it perfectly, then we wouldn't have a 12-hour turnover time to detect fire, right? Let's just think about your example. Let's say that if thermal is high enough and IR is there, then we're going to say there's a fire. Now let's think: it's a hot day in Texas, we're flying over, there's going to be a fire, but we also fly over, like, Houston, right? Houston's hot, the sun's bright, and that means the sun is reflecting off the ground; the asphalt, like the highways, is hot as hell, right? So it's going to pick up high IR and high thermal, even though if I looked at it, it's just black asphalt. That's an example, right? Because that was one of our false positives. When we tried that algorithm, when we looked at parts of Southern California, it would mark roads as fires very often, actually.
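Note: the hand-coded baseline Aref describes fits in a few lines, which also makes the hot-asphalt false positive easy to reproduce. The 0.72 ratio comes from the interview; everything else below is an illustrative assumption.

```python
# Hard-coded rule in the spirit of the example: flag a pixel as fire when the
# thermal/IR ratio crosses a threshold (0.72, per the interview).
import numpy as np

def fire_mask(thermal: np.ndarray, ir: np.ndarray, ratio: float = 0.72):
    return (thermal / np.maximum(ir, 1e-6)) > ratio

# Hot asphalt on a sunny day reads high in both bands, so it clears the same
# threshold as real fire -- the road-marked-as-fire false positive.
thermal = np.array([[0.9, 0.2],
                    [0.8, 0.1]])   # fire at (0,0), sunlit asphalt at (1,0)
ir      = np.array([[1.0, 0.9],
                    [1.0, 0.9]])
print(fire_mask(thermal, ir))      # flags both the fire and the asphalt
```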

-

 

-

Brian:

Okay, so I guess this deep learning approach is a more - general approach?

-

 

-

Aref:

Yeah. Also, the important part about it is that you can generalize from RGB, like saying: okay, when I have RGB like this, I tend to have IR and thermal like this, right? And that means I'm very likely to predict that this is a fire. Let's say it was pretty cloudy, like very smoky, when I flew over Southern California, but I saw these red patches, and when I looked at the IR and the thermal, they gave me indicators that there was fire, and the label was fire. So that means that even when we fly it, and this is actually one of our hypotheses, even when we fly it without access to IR and thermal, because the RGB is associated with that outcome, which is fire, we would still have a strong predictor for it.

-

That's something we have to test, and that's why I have to keep working.
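Note: one common way to realize the hypothesis Aref states (RGB alone stays predictive because it co-occurred with IR/thermal during training) is to treat the extra bands as privileged, train-time-only signals. The sketch below is a hedged illustration of that idea, not the team's actual method.

```python
# Sketch: train with the full sensor suite, but add an auxiliary head that
# forces the RGB features to predict the IR/thermal bands. At inference only
# RGB is needed. Illustrative assumption, not NASA's pipeline.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
fire_head = nn.Conv2d(32, 1, 1)   # per-pixel fire logit
aux_head = nn.Conv2d(32, 2, 1)    # reconstruct IR + thermal

rgb = torch.randn(4, 3, 64, 64)
ir_thermal = torch.randn(4, 2, 64, 64)                    # train-time only
fire_labels = torch.randint(0, 2, (4, 1, 64, 64)).float()

feats = encoder(rgb)
loss = (nn.functional.binary_cross_entropy_with_logits(fire_head(feats),
                                                       fire_labels)
        + 0.1 * nn.functional.mse_loss(aux_head(feats), ir_thermal))
loss.backward()
# Deployment: fire_head(encoder(rgb)) -- no IR/thermal required.
```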

-

 

-

Brian:

Okay, my next question. So I saw that...

-

 

-

Aref:

 So let me know if that doesn't make sense. - I'll clarify. I tend to speak pretty fast

-

 

-

Brian:

No, I think it made sense. Okay, so I think earlier you also mentioned that you joined the IDEAS Lab, and I saw that you did research in the VR area. Can you explain what type of research you did?

-

 

-

Aref:

Yeah, the paper is actually already out. I'll talk about what they did, because I actually stopped taking part at some point; things happened and whatever. So basically the dream of that project was: let's say we're in a voice room, right? We're doing Zoom like this. And let's say we're not very comfortable showing our faces, but we have an avatar, like the Apple Animoji, right? Once it has learned this is how I look when I'm mad, this is how I look when I'm surprised, things like that, it's able to just take in my language, my actual speech, and figure out what my face would look like at the time. So the whole point is that if I'm in VR and I don't have the compute to actually track my face, I'm able to turn my speech into an expressive talking head, pretty much. We call it speech to affective gestures. What that means is that even if I'm just speaking, you can figure out what my emotion is like, right? And if I had a 3D point cloud of a human face, think about Snapchat filters, when I'm angry it's able to figure out what my face looks like at that time. So that's what we were trying to figure out.

-

 

-

Brian:

So it tries to incorporate the semantic meaning behind your sentences, and it maps that onto, like, an avatar?

-

 

-

Aref:

Yeah, but not only that, right? I'm also speaking, so the avatar's speaking, right? So first off, it has to learn how mouth movements work, but also how mouth movements and facial expressions combine.

-

 

-

Brian:

Oh, so does the model also take into account maybe the - volume of your voice? So like, maybe if I'm, I don't know, like, I can't really think of an - example, but maybe if I'm loud at one point, that might lead to a different expression as opposed to - when I'm, you know, whispering or something?

-

 

-

Aref:

Yeah, actually, there's a thing called VAD in the emotional space. I forget what the D stands for, but the V and A are valence and arousal. (Editor's note: the D is dominance.) You could think of arousal as how energetic the level is, right? So if I'm very angry and I'm super loud, my arousal is pretty high. Part of expressing emotion is that, obviously, you're going to classify what your emotion is; mine might be inquisitive, yours might be questioning, or, to use a common emotion, angry, right? But even once I classify that, I have to figure out, based off this sound snippet, what the face looks like when it goes extra loud, things like that. That's part of your training data.

-

 

-

Brian:

Okay, so it takes that into account, the VAD? What did you say?

-

 

-

Aref:

VAD is just an example of what I'm saying. In our case, the whole idea is that we synthesize speech into motion, right? Let's say our training data is someone speaking plus their audio file. When someone's speaking, obviously your face changes when you're making more noise, right? So you could think of it as the facial motion that's expressed being associated with that sound bite. But on top of that, you also have a label for what emotion that is.

-

So when I'm speaking loud and I'm saying that word and I'm angry, it - looks different than when I'm speaking loud and I'm explaining how angry I am to you.

-

 

-

Brian:

Okay, wait, so from what I'm gathering, the algorithm that you worked on tried to predict the facial expression of an avatar based on how you speak and not the words themselves. So, like, let's say I said something angry, a sentence that would indicate that I'm angry, but I said it as if I were passive; then the face would look passive.

-

 

-

Aref:

Yeah, you can think of it like this: if you close your eyes and you hear me talk, now that we've talked for a little, you can imagine what my face looks like at the time, right? Or say you imagine your mom yelling at you; you know exactly what your mom's face looks like. So the whole idea is that if I can synthesize what you look like when you speak with this assumed emotion, then I can make a better reconstruction of a realistic person speaking. The whole point of the emotion is that as people, when we converse, we don't speak in monotone, right? It's a hard way to actually express ideas without using emotion. And so our intuition was that if we feed an emotion into the context of generating a face that speaks, you get a much more realistic output.

-

 

-

Brian:

Yeah, I mean, that's really interesting. Wait, so when you actually train your algorithm, is it like somebody says a sentence, and then there's some sort of camera showing the exact 3D coordinates of the outline of their face, and then you just feed that into the model?

-

 

-

Aref:

Yeah, so I can explain, at least from when I looked at it; this paper was worked on for like six months after I left, so they did a lot of stuff that was different from when I saw it. But I can explain what I understood at the time. The way that it looked to me was: let's say the training data is a bunch of people who have their facial emotions, like the corner points of their whole face, tracked, right? And when they're speaking, yes, we have this audio clip, but we also have a label for what it looks like, so this is an angry person saying a sentence, and this is what their face looked like at every single second of that clip. Meaning, think about the Snapchat facial filter where it has a whole polygon: this is what the polygons look like at every moment. So the whole idea is that you're not really reconstructing a person's face, but that polygon, because you map those meshes; in VR, in any sort of game, your meshes are actually mapped on top of the skeleton structure. So you can imagine that we're just predicting what an average face would do when speaking at that time, and then you map stuff on top of it.
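Note: a toy sketch of the setup as described: per-frame audio features plus an emotion label in, per-frame landmark ("polygon") coordinates out. Every dimension and name below is an assumption for illustration, not the lab's actual model.

```python
# Toy speech-to-facial-motion model: audio frames + emotion label -> landmark
# (x, y) coordinates per frame. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

N_LANDMARKS, N_EMOTIONS, AUDIO_DIM = 68, 8, 80

class Speech2Face(nn.Module):
    def __init__(self):
        super().__init__()
        self.emotion_emb = nn.Embedding(N_EMOTIONS, 16)
        self.rnn = nn.GRU(AUDIO_DIM + 16, 128, batch_first=True)
        self.out = nn.Linear(128, N_LANDMARKS * 2)

    def forward(self, audio, emotion):           # audio: (B, T, AUDIO_DIM)
        emo = self.emotion_emb(emotion)          # (B, 16)
        emo = emo.unsqueeze(1).expand(-1, audio.size(1), -1)
        h, _ = self.rnn(torch.cat([audio, emo], dim=-1))
        return self.out(h).view(audio.size(0), -1, N_LANDMARKS, 2)

model = Speech2Face()
frames = model(torch.randn(2, 100, AUDIO_DIM),   # 100 audio frames
               torch.tensor([3, 5]))             # e.g. "angry", "surprised"
print(frames.shape)                              # (2, 100, 68, 2)
```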

-

 

-

Brian:

So, I mean, I can see how this can have many applications in, like, VR. So I was going to ask: what is your opinion on the metaverse? Like, is that going to, you know, pan out?

-

 

-

Aref:

I hate to say I have a crystal ball, because I don't; I've predicted wrong a lot of times. But I do think it's a little bit silly not to imagine that our lives are going to get more integrated with computing as life goes on. I mean, before, what was the bar for AI? Like, okay, it would never be able to beat somebody at chess; that's a long-gone game, right? It would never be able to beat somebody at Go; that's gone. It would never be able to drive a car; I mean, I work at a company that literally challenges that, right? And now it's, it'll never be able to write essays, and the bar for essays has now risen. It's a moving game. I don't know how I'd define it; we're pretty much going to live real-life SAO, right? If anyone out there watches that anime. I don't know if that's how I'd describe it, but let me rephrase: a lot more of your life will be interacting online, right? And so however we can make that more expressive and more, I guess, core to human emotion, I think that'll be very valuable as time goes on.

-

 

-

Brian:

Okay, so what made you interested in AI and what kind - of resources did you use to learn more about it at Purdue?

-

 

-

Aref:

Good question. I'm interested in AI, but I would - also call myself a generalist

-

I don't think I'm a superstar in any specific field. I just like to learn a lot. What I think helped me a lot is, honestly, I bother a lot of people. I ask a lot of people: hey, I want to learn this, I'm doing this, do you have any tips for it? Let's say, for example, I wanted to learn more about doing a research-style project. Say at this point I'd tried a couple of tutorials, I'd made a couple of networks, I like the stuff, and I want to take it more seriously. I would shoot emails to professors saying: I like this, and what you do is similar.

-

Take Professor Bera and the IDEAS Lab: he works on bringing human emotion into the AI and robotics space. I had worked on human expression in a computer vision product, meaning I would interact with the real world and turn that into something in the computing space. It's called AirDraw.

-

 

-

Brian:

AirDraw. Oh, I think I saw a video; it's where you just draw in the air.

-

(here is video: https://arefmalek.github.io/blog/Airdraw/)

-

 

-

Aref:

I made that. But anyways, what's important is that I always just reached out and said: hey, I want to learn a bunch, do you have any time for me? And I tried to find my way there. Resources that I recommend to everyone: I would recommend that you just start to read more, meaning that you don't have to understand 100%; get to like 50%, 70%, 80%. I would use stuff like d2l.ai. I would use stuff like Andrej Karpathy.

-

(He has a YouTube channel.)

-

He's really brilliant. I just love to listen to him speak and teach. And also, I would say the Purdue curriculum is trending more towards AI; that's why we have a whole major for it now. A lot of the courses will help you gain foundational knowledge, or at least understand the history of AI.

-

 

-

Brian:

Yeah, yeah, I mean, I know that AI majors, they need - to take like some philosophy courses or something

-

 

-

Aref:

Yeah. But let's say that it's a very broad - question

-

 

-

*** Weird noise so had to cut out

-

 

-

Aref:

I'll say it like this. Imagine I'm a freshman student with no ML experience, I really like ML@Purdue, and let's say I don't think I'm qualified enough to join an ML project here.

-

How would I make myself a little bit more presentable or improve my chances? I would say, first off, with your math skills, what can you learn and what can you make, right? Let's say at first I was able to follow a couple of tutorials: I learned a bit of NumPy, then I learned PyTorch, I made a couple of CNNs, maybe I made an LSTM, I made a couple of networks, right? I learned how backpropagation works, I learned how to make stuff. That's considered AI. Now, if I have that knowledge, could I do it at a little bit higher level than a tutorial? The resources aren't really available to me, but could I learn given some mentorship from a lab or something? I would do that; I would reach out to a lab and say: hey, I'm starting out, I really want to learn, and I like what you guys do for X, Y, and Z.

-

I noticed that you guys probably need help with this; could I help? I would try to do that for a while. And then once I can show that I thoroughly understand the baseline, which is tutorial level, that I understand what those things are doing, and that if I were doing research I'd understand the requirements that come with it, it shows that I'm someone who can learn, right? I haven't mastered computer vision; I don't consider myself a master at all. But I think I can learn, right? If the situation comes down to it. And I think that's the most important thing you can show on a club app, a job app, a research application, anything, right?
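Note: for a concrete picture of the "tutorial level" Aref keeps referring to (NumPy, then PyTorch, a small CNN trained with backpropagation), here is a minimal self-contained example on dummy data.

```python
# A small CNN classifier trained with backpropagation -- roughly the
# "tutorial level" baseline described above. Dummy tensors stand in for a
# real dataset like MNIST.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 7 * 7, 10),   # 28x28 input -> 10 classes
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(3):                          # swap in a real dataloader
    x = torch.randn(32, 1, 28, 28)             # dummy MNIST-shaped batch
    y = torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()                            # backpropagation
    opt.step()
```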

-

 

-

Brian:

Yeah, so you just mentioned research. So would you say - that it's easy to reach out to professors here for research opportunities?

-

 

-

Aref:

Yes and no. Certain professors are very open to accepting new people; certain professors aren't. For example, Professor Bera's lab has ballooned from like 10 to 15 students to like 30 to 40. They're pretty much always looking for someone else to come in. And if they're not, there are other professors who are very open to students as well. There's been a whole blossoming of AI in the past year or so. So I would say that if you have a way to say, hey, I know that you're working in AI, I have these skills, I'd like to learn a little bit more, could you take me in? I'd be hard-pressed to say that the entire school would say no to you.

-

 

-

Brian:

Okay, so my final question. So this is sort of a - general open-ended question

-

How do you think AI will shape our world? It's a very broad question.

-

 

-

Aref:

Yeah. That's a very, very broad question. That's like asking how computer science will change the world. So I'll answer it from how I would interpret the question if I were asking it, and then I'll try to answer that. If I were asking this question, I would imagine: how will AI shape, let's say, the face of CS, the face of the tech industry, and also how everyday people interact with their computers? Let's just keep it simple, right? At the base level, I would say the way I would describe the whole AI trend is kind of no different from any other trend that's occurring right now. Think back like three years ago, right? People thought that banks were going to die out, that we were going to have blockchain instead. Yeah, like cryptocurrency was going to be the new US dollar or something like that, right? These were definitely things that people said. And now when you open LinkedIn, you're going to hear at least five people in a row say, "AI won't replace you, but people using AI will," and they'll all say the same thing, right? So I would say from that point of view, for anyone thinking about their personal investment, it's another tool that's being created. Obviously, you'll have to adapt. I'm sure someone who says "I'm trying to look for a job, but I don't know how to use Google" will struggle, right? And the same will be true with ChatGPT and all the other LLM tools and stuff like that.

-

But what I wouldn't fall into is being one of those people who say our robot overlords are just bound to come in and take us over, right? That I think we're a ways away from. And the reason why is because I'm hesitant to trust any sort of doom-and-gloom approach; I think they're trying to profit off you at the end of the day. That's what I'd say to someone who's worried about themselves, because I am too, right? I'm sure that a lot of the work that I do, if it's not in pure research, will eventually become replaceable. That's a sign that as humanity, we're growing, right? I hope that in 10 years I won't need to know React; if I still do, that's a sign that the field hasn't advanced at all, you know? So I would say, as someone who's worried about the advancement of technology: it's par for the course. I mean, we automate people all the time. When the car was invented, a lot of people lost their jobs; a lot of jobs were to take care of horses. We don't need horses anymore; we have the car, right? That's bound to occur. And it's a little sad to know that it's bound to happen, but it's also par for the course. Now, what's the second way I would talk about this? What does this mean for someone interacting with their computer? The way people used computers in the 50s and 60s at NASA, when the moon landings were happening, versus now is totally different. I mean, a calculator has more power than the first rocket to the moon. And what does that mean? It means that every scale of technology is about to change. Basically, every single company is racing to figure out the best hardware design to train LLMs. Like I said before, this is publicly stated, not insider knowledge: Tesla had a self-proclaimed top-10 supercomputer in the world just to train an AI that just sees things. But now you can imagine what it takes to train a ChatGPT or any model like that: every single query you send to ChatGPT uses eight NVIDIA GPUs on average; that's the stat that's thrown around. So you can imagine that if it's one query per person to eight GPUs, and GPUs are expensive, that's not going to last forever. Every company is racing even at the pure hardware, silicon level.

-

Like, how do we make a new design to handle this, right? That's why Tesla's stock skyrocketed because of Dojo. That's why Google is making things called TPUs. That's why NVIDIA makes so much money, and why AMD is trying to compete with them. That's the base level. People are making new ML frameworks; NVIDIA obviously has CUDA, which is what they run when processing on their GPUs. I'm sure that Tesla's going to make something.

-

I don't know, but they might. AMD is bound to make something because they want to compete. Apple's going to do the same thing. Basically, at every single level, to handle this new disruption in technology, we're going to change how we design things. And eventually, I think AI will also become something similar to a commodity. Think about the iPhone 16 years ago; the iPhone is almost as old as us. When people first saw it, what did they think iPhones in 16 years would look like? Did they think it would look like what it does now? Or did they think it's going to be like, I'm going to open my eyes and see Jarvis around me? Let's be honest, they probably imagined the second. I think what we tend to forget is that at the end of the day, technology has a financial incentive. So however fast it can become a commodity is probably where it's going to head, and it probably won't leave that. That's why Google, like, the original algorithm that Google ran could never run the whole internet today; all this SEO stuff that people figured out would make it pretty much unusable. So what I'm getting at is that, as someone thinking about how they interact with their computer, whatever commoditization we find for AI will, at the end of the day, be a product. I mean, we're America, we're a very capitalist place. So we're trying to find a way to make this usable for people as a product, but also to actually meet our goals: aspirations, career goals, finance goals, whatever. Does that answer the question? It's a very open-ended question, so I'm sorry that I gave an open-ended answer.

-

 

-

Brian:

No, that was the point. I wanted that; if I ask everyone I interview that question, everyone will have completely different answers. So yeah, that was great. Thank you so much for allowing me to interview you, Aref.

-

 

-

Aref:

Yeah, it's my pleasure. Appreciate you doing this - service pretty much. Some of the smartest people I know are interviewing with you, and I'm just really - excited to hear what they say. Yeah

-

 

-

Brian:

Well, I just like cool stuff, and I'd love to hear - more

-

 

-

Aref:

 Yeah, cool

-

 

-

 

-
- - - - - - + +

AI in Classrooms with Dr. Lindsay Hamm

+

September 29, 2023

+ +
+ + + +

Robotics + LLM with Jacob Zietek

+

September 23, 2023

+ +
+ + + +

Computer Vision with Aref Malek

+

September 17, 2023

+ +
+ + + +

AI with Dr. Eugenio Culurciello

+

September 16, 2023

+ +
- - - - -

AI with Dr. Eugenio Culurciello

-

September 16, 2023

-

Transcript

-
-

Dr. Eugenio Culurciello is a professor at Purdue in the Biomedical Engineering (BME) department. He is also the faculty advisor for the ML@Purdue club! He is interested in chips, neuroscience, and neural networks.

-

 

-

Other information:

- -

 

-

Brian:

My name is Brian and I am interviewing Eugenio - Culurciello.

-

 

-

Eugenio:

Nice to meet you, Brian.

-

 

-

Brian:

 Thank you for this interview. Can you explain your background?

-

 

-

Eugenio:

So I was actually trained in analog and mixed-signal microchip design. I worked in the neuromorphic engineering area at the beginning. The idea was to figure out how to replicate human abilities in silicon or in artificial technologies. And then later I started working on machine learning and neural networks when I met Yann LeCun. I was at Yale University and he was at NYU. We met and we started working together. First, we developed some microchip accelerators for deep learning; that was about 15 years ago, and this area was not really popular yet. Then, when I joined Purdue in 2011, I continued, and my group developed about five generations of machine learning hardware. We were also developing neural network models, and my group was also instrumental in the beginnings of Torch, which was the precursor of PyTorch.

-

So yeah, we worked on all these areas. And right now, I'm a professor at Purdue, I work on machine learning and AI, and I'm trying to work on multimodal large models at the moment.

-

 

-

Brian:

I also saw that you taught some deep learning courses - here

-

 

-

Eugenio:

Absolutely. Yeah, actually, ours was the very first deep learning course at Purdue; we started in 2011-12, in the early days. And yes, we've been teaching deep learning ever since, at least once a year.

-

 

-

Brian:

Yeah, I was actually curious: how come the deep learning courses were taught under BME instead of CS?

-

 

-

Eugenio:

Yeah, I think it was under BME because a lot of our work took inspiration from biology, neuroscience, and psychology, but the final goal was replicating the human brain in our own software. When I joined here, there were a few people teaching basic neural network fundamentals, but we were really the first class to teach the more modern deep learning, including convolutional neural networks and so forth.

-

 

-

Brian:

Okay, so I wanted to ask you about this paper I saw last year. It was called DishBrain, and it's where researchers grew brain cells and then taught them to play the video game Pong. Because you have a lot of experience in hardware, I was wondering whether this is a plausible idea for the future, where you use real brain chips that are specifically trained for certain tasks.

-

 

-

Eugenio:

I think it would be a good idea, also because, you know, all the machine learning models that we build are currently highly inefficient compared to the brain, on the order of 1,000 to 10,000 times less efficient. So if you could interface with living, breathing neural networks, that would be awesome. One of the issues is that we currently don't have the capability to interface with a very large number of cells, or at high throughput, so the input and output is a bit limited. But that said, yeah, I think it could be a very, very promising area if we can figure out a way to grow the right number of electrodes to interface with the tissue.

-

 

-

Brian:

Okay, so it might be hard to add, like, an image sensor or something directly to this brain chip, I guess.

-

 

-

Eugenio:

Yeah, I think the problem is the number of connections. In the brain, maybe in a millimeter cube you have hundreds of thousands of cells, right? There are probably a million connections in there. But in terms of physical connections, we couldn't have that many. All the connections we have from computers to biology use some kind of electrical wires, and living cells don't usually like that; they reject it. That's also a problem for wearable technologies that are invasive; it's always been a bit of a problem. I don't really know how to solve it at the moment, but yeah, there are many colleagues here who work in that area also.

-

 

-

Brian:

Okay. My next question. So, what do you think about the embodied Turing test, where instead of judging an AI based on how well it can mimic human intelligence, you judge it based on how good it is at doing certain animal-related tasks? For example, an artificial beaver building a realistic dam. And do you think this approach is more promising than current methods focusing on, you know, linguistic simulation?

-

 

-

Eugenio:

Yeah, honestly, I agree with that. I really think that we don't have a definition of what constitutes one sort of entity versus another, you know? What constitutes a cat, and how is it different from a dog, or what really constitutes a human being? There are so many levels there, and we don't really have a formula for it. So I think that the best way for us to say whether an artificial system is close to a real one is really what it can do. If it can do the same things, then functionally they might be equivalent. But you also can't possibly test all the things that one entity can do, because there are infinite possible combinations. So there has to be some kind of test, and I don't know what that is. Even recently, with the large language models, people have just started scratching the surface; they run all the possible tests on these systems to try to figure out what capabilities they really have and don't have. And it's very unclear what the results are, what you can do and what you cannot do; at the end of the day it's a bit of trial and error.

-

 

-

Brian:

Okay, so you just keep coming up with some random tasks and you see if it does well?

-

 

-

Eugenio:

Yes, I mean, more than random tasks, I guess. You know, it's always like this: you always want an artificial system to do something. So I guess the idea is, you try to make it do what you want and make sure that it can do more or less the same thing that a human can. So, for example, autonomous driving, right? Even there you have an infinite number of scenarios that you could be faced with while driving a car, whether you're human or artificial. And so you can't possibly test all of them. You also can't possibly train on all of them. So you have to test on a lot of conditions, see what the problems are, and design the system in a way that is fairly generic and has a bit of common sense, like us. Other than that, I'm not sure one could identify a test that would say, okay, yes, this car is good enough to drive or not. I think we will never have such a test, because there are always different possible conditions, right? For example, years ago I was driving on the road and all of a sudden there was a ladder blocking my entire lane.

-

 

-

Brian:

 A ladder?

-

 

-

Eugenio:

A ladder, yeah, like a ladder that you walk on. And it was pretty tall. First of all, you would have to recognize it, and then you have to figure out what to do. All the lanes nearby were completely filled with cars, so I had to decide: should I just brake, or should I go over it? And if I go over it, my prediction was, okay, all my tires are going to pop. You see my point: I ran over it and nothing happened. The point is, sometimes there are so many different possible cases. One time I was in Baltimore driving on the highway and some guy jumped over from the other lane and ran through. I mean, all sorts of crazy things can happen on the road. You could find a crater at some point, a bridge that is interrupted, you know, all sorts of things. So I don't think you can test all these cases, right? For living things that have infinite possible actions and infinite possible scenarios in the environment, I don't think there will ever be a test that can cover all their capabilities, honestly, just by the sheer number of possibilities. You know, it's the same for you: when you go and take a driving test, what do they test? They test a little bit of your abilities. But it's just a tiny subset, you know, 0.1 percent of what you'll ever encounter even normally. And then they say, yeah, you're ready to drive. So I guess we usually hold artificial systems to a higher standard, which is, you know, good and bad. But that's because we kind of have an idea statistically of what a human can do on the road before we give them a driving license, and we don't have a statistical idea of what a machine can do. The point is, you have to test a lot, and at some point you'll have the same kind of statistics, and you'll be okay with that, maybe.

-

 

-

Brian:

So I saw that you have a Medium blog with lots of articles, and I read through some of them. In one of your articles, you mentioned that transformers are really close to universal neural networks and can handle lots of different types of data, such as vision, text, speech, and so on. Do you think this is close to the final neural network architecture?

-

 

-

Eugenio:

Yeah, I think so, honestly. At the beginning, let's say like 10, 15 years ago, people were pretty happy crafting neural networks, running them on some data, and training them. But it was really hard to scale up and continue to learn different modalities, and I think we spent way too much time focusing on creating data sets and creating custom neural networks for a specific data set. At the same time, people were looking for a swiss army knife of neural networks. And one such network could be the transformer architecture, which has really surprised us, honestly, just in the last couple of years: the capabilities, all the things it can do and learn when scaled up, even with very simple learning techniques like predictive learning, right? So I would say that's really good; we're scratching the surface of what the real artificial brain could be. I think we have to embody it, give it more senses, train it multimodally, and then try to figure out what it can learn, right? We also need to figure out continual learning capabilities: how much it can continue to learn, and learn online. But I would say it's more exciting now than five years ago. Five years ago it looked like we were doing the same things over and over, and now it looks like we have foundational models that can do much, much more, so it's exciting. I think by trial and error we will keep looking and try to find something that can model our brain. That said, we're still thousands of times away in terms of efficiency. The capabilities are getting better, but it also hasn't really been used in robotics as much, and it hasn't really improved things in the physical world. So I think there's still a lot of work to do. It's a never-ending story, but it's exciting.
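Note: the "swiss army knife" point is that once any modality is turned into a sequence of token embeddings, a single transformer can consume all of them. A minimal sketch with assumed sizes:

```python
# One shared transformer over two modalities: text tokens and image patches
# both become D-dimensional embeddings. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

D = 256
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=4,
)
text_emb = nn.Embedding(30000, D)            # token ids -> embeddings
patch_emb = nn.Linear(16 * 16 * 3, D)        # flattened patches -> embeddings

tokens = text_emb(torch.randint(0, 30000, (1, 32)))     # (1, 32, D)
patches = patch_emb(torch.randn(1, 196, 16 * 16 * 3))   # (1, 196, D)
out = encoder(torch.cat([tokens, patches], dim=1))      # one model, both inputs
print(out.shape)                                        # (1, 228, 256)
```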

-

 

-

Brian:

Okay, so in another Medium article that you wrote, you mentioned spiking neural networks, and those try to actually simulate the spiking in real neurons, right? That seems like a step towards bringing more neuroscience into AI. But at the same time, I know those networks still rely on backpropagation, which is not very realistic neuroscience. So in the future, do you expect that AI will continue to move towards looking almost one-to-one like a real human brain, or do you think there will still be a combination with engineering, where you just use whatever works?

-

 

-

Eugenio:

Yeah, I think there are going to be quite a few heuristics, but the bottom line is that artificial systems based on silicon are very different from biological neural networks, right? Biological neural networks are great; I mean, they allow us to do all these things. But they have a fundamental problem, which is that electricity doesn't conduct very well within our body, in a physiological solution. So they can only transmit pulses, and over short channels; I mean, some channels are pretty long, like from the brain all the way to the spinal cord is about a meter, right? But in general they use this spiking because you need to constantly reconstruct the signal, which would dissipate otherwise. In silicon we can make really good wires, but in biology you can have lots of wires in a very, very small area, and in silicon we can't; we can only make a few layers of wires, and the density is very low. So things are different. There's also much to learn, because in an artificial neural network you can do backpropagation over many layers, because you can look at the very small differences across many layers. In biology, the noise is so high that you can only go from layer to layer. So I think in artificial neural networks we still have to learn how to scale up that layer-by-layer learning. I don't know, that might be more efficient; I'm not sure. Maybe that's one of the ingredients of efficiency in the brain, or maybe it's not. Maybe it turns out that artificial networks with backpropagation are much more efficient, because you propagate the signal very easily and you can train them faster. I don't know if we know the answers to those questions, so we'll have to look at it, learn more, and try things. Also, biology is stuck with its own biochemical substrate, and we're stuck with silicon chips, right? Maybe there's another medium that would create better artificial brains, but that's what we have right now. So yeah, we can't answer those questions right now; people will keep investigating these issues.
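Note: a minimal sketch of the "layer by layer" learning he contrasts with end-to-end backpropagation: each layer gets its own local loss, and gradients never cross layer boundaries. A greedy illustration, not a biological model.

```python
# Layer-local training: every layer has its own head, loss, and optimizer,
# and activations are detached between layers so no gradient travels across
# many layers the way end-to-end backpropagation does.
import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(64, 64), nn.Linear(64, 64)])
heads = nn.ModuleList([nn.Linear(64, 10) for _ in layers])
opts = [torch.optim.SGD(list(l.parameters()) + list(h.parameters()), lr=0.1)
        for l, h in zip(layers, heads)]

x = torch.randn(16, 64)
y = torch.randint(0, 10, (16,))
for layer, head, opt in zip(layers, heads, opts):
    x = torch.relu(layer(x))
    loss = nn.functional.cross_entropy(head(x), y)  # local objective only
    opt.zero_grad()
    loss.backward()
    opt.step()
    x = x.detach()                                  # block cross-layer gradients
```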

-

 

-

Brian:

So this interview is more oriented - towards beginners, like in the club. I wanted to ask you, do you have any advice for students interested in - AI?

-

 

-

Eugenio:

Absolutely. Yeah. Well, I think it's really a great time today because of the internet and all the code and examples and things that we can share. I think what a person really needs is mostly drive. At any level, if you want to learn, there's so much material, but you shouldn't be discouraged, and you have to figure out a way to learn step by step. There are so many courses and so many levels in machine learning that I think anyone can pick it up. One of the issues, of course, is you need to learn programming a little bit. I would say you start with a Python class or something like that, then you move on to some simple classes or simple tutorials on machine learning, and then I think the next step is to jump into some nice project that you like. This ML@Purdue group is really awesome because it allows you to form a group, learn from each other, and do things together, which keeps you excited and motivated. So if you've joined such a group, or a similar group anywhere, and you have passion, you can learn really easily. Honestly, you don't even need a university; you don't need a teacher, you don't need a professor. A lot of this stuff you can learn on your own, which is nice, and also scary. You just need a lot of passion; I think that's all. That's true for almost everything, but there are some things that are hard to learn: I can't learn about a nuclear power plant unless I work in one, right? But machine learning, oh gosh, you just need a laptop. So it's so much easier.

-

 

-

Brian:

Yeah, I mean, just watching YouTube videos, I guess.

-

 

-

Eugenio:

Oh my god, there's so much knowledge in there, right, that you can learn from really awesome people. I do it too, you know; I often listen to lectures and ideas online from people I never met, and it's really awesome. All you need is some area that you're interested in, and then just try to go deep and solve problems, more and more problems. There's always a problem to solve. And if you don't know, you ask a group. Even on GitHub there are amazing projects you can join; ask what problems they're trying to solve and help them out. That's an awesome way to learn. And you can do that on your own, on your laptop, anywhere you are, so it's really great.

-

 

-

Brian:

-

Yeah, I mean, on GitHub nowadays like everything that's trending is just some AI - model or something

-

 

-

Eugenio:

 Yes

-

 

-

Brian:

-

Okay, my last question. I know you might be a little biased, but do you think AI is just hype, like blockchain or NFTs?

-

 

-

Eugenio:

Yes, AI is hyped; I mean, even you and I are talking about AI now. I don't even like the term much, because everything we're doing so far is mostly machine learning, you know? It's some basic learning algorithm. Maybe we're getting closer now that we have these large foundation models, but I would say AI really starts when you have, say, a robot trying to do something in the real world, under real constraints. And I think we still need to get there. Machine learning, neural nets, deep learning will definitely play a role, but we need to do more work in robotics, because we're still behind. Even with all these algorithms, we don't have good algorithms for grasping arbitrary objects or navigating an environment, and we still don't know how to learn all these things. We still don't know how to do multimodal integration in a robot that has different sensors and keeps learning and keeps training, so I think that's an area that needs to evolve. But yes, if AI, or whatever you want to call it now, ends up empowering these robots, I think it will change the world for sure, because we'll have artificial entities that are as capable as us, or better. They could do lots of things we cannot do. For example, we could send them to explore the universe, because they can live forever, or replicate somewhere else. Maybe we won't even be around to learn what they find out, because our lives are so much shorter and so constrained by this planet and by how far we can reach in the short time we live.

-

 

-

Brian:

When you were talking about robotics and AI, I was thinking about those food-delivery robots around campus. Are we going to see like ten times more of that, or robots delivering, I don't know, refrigerators or something?

-

 

-

Eugenio:

No, that's right. Yeah, I mean, those things cannot walk up the stairs or open the door and hand things to you. But soon they will be able to do that, so that's great.

-

And I just hope that we'll be able to program them and make sure that they only - work for a good cause.

-

 

-

Brian:

Thank you so much for the time

-

 

-

Eugenio:

Okay, so it's been a pleasure to talk to - you

-

Yes, and if you need anything

-

Contact me again

-

Okay, yeah

-

 

-
- - - - - - - -

AI Interpretability with Jinen Setpal

-

September 6, 2023

- -

Transcript

-
-

Jinen is a junior undergrad data science major and ML@Purdue officer. He has deep experience in ML on both the research and industry sides. Today, he will talk about AI interpretability, which is exactly what it sounds like: building AI that is interpretable to us humans.

-

 

- -

This is a new series, where I, Brian, will interview cool AI professors/students and - talk about their research interests and how students interested in AI can get involved in research. -

 

-

Advice for beginners summary:

-
    -
  • Check out D2L.ai (Dive into Deep Learning)
  • -
  • Read books about ML, like Python Machine Learning by Sebastian Raschka
  • -
  • Contact professors
  • -
  • Jinen mentioned arXiv, an open-access repository of preprints (papers not yet peer reviewed). Lots of ML papers are on it, and you may have seen links to them on GitHub when searching for a certain algorithm
  • -
  • Be interested!
  • -
-

 

-

 

- -

Brian:

-

Hi, my name is Brian. I'm a freshman, and my role is to interview cool AI students like Jinen here, and also, in the future, AI professors.

 

-

Jinen:

-

My name is Jinen and I'm a data science student and I love research.

-

 

-

Brian:

-

So, Jinen, I looked at your website and I saw that your research focus is on - interpretability

-

Can you explain what it is in the context of computer vision and natural language - processing?

-

 

-

Jinen:

-

Interpretability, generally, is the degree to which we can understand why a model makes the decisions that it does. In deep learning this is a problem because most models are hugely parameterized: you give them an input and get an output at the end, and everything that happens in the middle is very difficult for us to understand. One way of getting around that is using interpretability techniques to get a grasp on what the model is doing. Certain models are more interpretable than others. Models that are interpretable by default, where you can just read their weights and tell what's happening, are called intrinsically interpretable models, and those that need a lot of manipulation, generally done after the entire evaluation process, are called ad hoc or post hoc interpretable.

-

 

-

Brian:

-

So wow it is raining outside

-

So, let's say I use ChatGPT or something. When I get the answer, it looks pretty much right. So why are interpretable models important?

-

 

-

Jinen:

-

Yeah, so interpretability is important mostly because models like ChatGPT are generally very, very over-parameterized. A consequence of that is that if you give them an input that is within their training distribution, it is very likely they will get it right, and the training set is a huge corpus. That is why it's very easy for us as human evaluators to miss certain kinds of mistakes that they tend to make. The important thing, especially in domains where the model's decision really matters, is that we need some degree of understanding as to why the model makes a decision. For ChatGPT, it's mostly a low-stakes environment, so even if we do not have interpretability, that's fine; we don't need to know so much. But if we are deciding whether to evacuate a city based on some statistical model of a tornado or another natural disaster, that is a very high-stakes decision that can impact a lot of people. So it is very important to know why the model thinks the city should or should not be evacuated, because guessing incorrectly either way will cost a lot in lives and money.

-

 

-

Brian:

-

So if a model is interpretable, that means that when we as humans see a certain bias that we know is incorrect, we can step in and try to fix it.

-

 

-

Jinen:

-

Yeah, there is one piece of work I did that actually did just that. There was this paper called Invariant Risk Minimization, and the general idea behind it is that there are a lot of cases with spurious correlations, and it is extremely easy for a machine learning model to identify those correlations and fit to them rather than to the actual target we are trying to generalize to. As a result, it gets a high accuracy, but not a perfect one; you might get something like 80% accuracy with a very small parameter set, and a lot of regularization functions actually promote using fewer parameters. This is a problem because spurious correlations are rewarded in the quest for generalization: corners get cut, and the model ends up identifying something completely different from the actual target. As a more practical example, the Invariant Risk Minimization paper talks about cows and camels. You see cows in grasslands 90% of the time and camels in deserts 90% of the time. So if we train a CNN on a sample of cows in grasslands and camels in deserts, there is a very high likelihood it will just latch onto the fact that grasslands are green and deserts are yellow, and decide whether it's a cow or a camel purely from the color of the image. Put a cow in a desert and a camel in a grassland, and it will predict incorrectly every single time. Even though the overall distribution of cows skews toward grasslands and camels toward deserts, we don't want the model keying on the color of the image, because then it isn't actually learning anything about the animal. The paper took a theoretical approach to fixing that by creating an invariant representation for a single object: a cow should have the same intermediate representation regardless of whether it is in a grassland or a desert, and they built a regularizer to enforce that. That was definitely much too complicated for me, so in addition to that approach, I used interpretability to attack the same problem: I used class activation mappings to find a heat map of the region that was used to identify the image.
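To make the failure mode and the fix concrete, here is a minimal sketch of the IRMv1-style penalty from the Invariant Risk Minimization paper (Arjovsky et al., 2019) in PyTorch. The environment loop, model, and hyperparameters are hypothetical placeholders, not Jinen's actual setup:

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # IRMv1 penalty: squared gradient of the risk w.r.t. a dummy scale w = 1.
    # If the classifier is already optimal in every environment (invariant),
    # rescaling the logits cannot improve the loss, so this gradient is ~0.
    scale = torch.ones(1, requires_grad=True, device=logits.device)
    loss = F.cross_entropy(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irm_step(model, envs, optimizer, lam=10.0):
    # envs: one (inputs, labels) batch per environment,
    # e.g. [(grassland_x, grassland_y), (desert_x, desert_y)].
    risk, penalty = 0.0, 0.0
    for x, y in envs:
        logits = model(x)
        risk = risk + F.cross_entropy(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    loss = risk + lam * penalty  # lam trades raw accuracy against invariance
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A classifier that keys on green versus yellow backgrounds is optimal in neither environment alone, so the penalty pushes the model away from the spurious color feature.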

-

 

-

Brian:

-

So can you explain what a class activation mapping is?

-

 

-

Jinen:

-

Oh yeah, of course. So generally your CNN, or convolutional neural network, works by taking patches of the image and applying a filter, or a set of filters, to each set of patches; that's one layer. And you basically propagate through that.

-

Now, it is very hard for us to tell what's happening based just on the output of the dot product between a patch and a filter. So I used the patch, the filter, and the bias to work out what's happening, because that is the intermediate representation.

-

A couple of researchers at MIT found a way to get the actual heat map, the weight map that the CNN was using to base its classifications on. So for example, if I were building a human classifier and ran the image on myself, I would expect this entire region, my body, to have a high weightage, because the model is trying to decide whether the target is a person or, say, a dog. It should not key on the background. If the background has a high weightage, it means the model isn't using my features to make the classification; it's using the background. So the general idea was that I created an intermediary layer that intrinsically generated this class activation mapping.

-

From there, I created a loss function, or a regularizer, that would specifically ensure that the focus of the image was the actual target and not the background. And once it was able to do that, I was able to generalize it and get emergent learning, where the out-of-distribution error of the model was reduced very significantly, because we were no longer overfitting to spurious features with the over-parameterized model.
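As a rough illustration of what Jinen describes, here is a minimal sketch of the class activation mapping from Zhou et al. (2016), assuming a CNN whose last conv layer feeds global average pooling and a single linear classifier; the tensor shapes and the mask-based penalty are illustrative assumptions, not his exact code:

```python
import torch

def class_activation_map(feature_maps: torch.Tensor,
                         fc_weight: torch.Tensor,
                         class_idx: int) -> torch.Tensor:
    # feature_maps: (C, H, W) activations from the last conv layer.
    # fc_weight:    (num_classes, C) weights of the linear layer that sits
    #               after global average pooling.
    weights = fc_weight[class_idx]                        # (C,)
    cam = torch.einsum("c,chw->hw", weights, feature_maps)
    cam = torch.relu(cam)                                 # keep positive evidence
    return cam / (cam.max() + 1e-8)                       # normalize to [0, 1]

def background_penalty(cam: torch.Tensor, target_mask: torch.Tensor) -> torch.Tensor:
    # Hypothetical regularizer in the spirit of the one described above:
    # penalize activation mass that falls outside the target region.
    return (cam * (1.0 - target_mask.float())).mean()
```

Adding a term like `background_penalty` to the training loss is one way to push the heat map onto the cow rather than the grassland.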

-

 

-

Brian:

-

My next question is: earlier, when you talked about interpretability, you said we have to make models interpretable. So is there a scale of how interpretable a model is, like a numeric score? Or is it just binary, interpretable or not interpretable, or maybe medium interpretable?

-

 

-

Jinen:

-

Well, I don't think there is an objective scale defined for interpretability. However, some models are definitely more interpretable than others, so I guess you could, if you really had to, rank models by how interpretable they are. If you look at something like a decision tree, that is generally one of the most interpretable models, because the reason for each split is usually obvious. But the deeper you go into the tree, the harder it is to understand why it made the decisions it did: we can identify exactly what the splits are, but we don't know why the splits exist; we just know that they exist. That's one example. Logistic classifiers are mostly straightforward, and linear regressions are also very straightforward, more or less the same thing: they have a series of weights, and each feature has a certain weight associated with it. So we know that a feature with a very high weightage is more important than the others. Deep learning models are at the other end: you cannot really understand them, so they're not super interpretable.
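To see the "read the weights directly" point in code, here is a small sketch using scikit-learn; the toy dataset and feature names are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: 200 samples, 3 features, label driven mostly by feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (2.0 * X[:, 0] - 0.5 * X[:, 2] > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# Intrinsic interpretability: each coefficient is one feature's weightage.
for name, coef in zip(["feature_0", "feature_1", "feature_2"], clf.coef_[0]):
    print(f"{name}: weight = {coef:+.2f}")
# feature_0 should carry the largest weight, mirroring the rule that
# generated y; a deep network offers no such direct read-out.
```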

-

 

-

Brian:

-

It seems like it's really challenging to, you know, build these interpretable - models.

-

So if you're designing an interpretable model, are you designing an entirely new - architecture or something? Or are you just sort of making slight modifications to an existing algorithm or - can you do like both?

-

 

-

Jinen:

-

Both are generally the way to go. If it is possible to set up interpretability purely algorithmically, you generally don't have to change the architecture at all; if you're using a post hoc technique, or something that is already intrinsic, that's fine. But if you want to promote a certain behavior, and you want that behavior to be specifically interpretable, it may be necessary to update the architecture a little bit. When I say update the architecture, most of the time I just mean adding some layers, removing some layers, or creating a specific set of layers, for example one dedicated to an invariant representation. Besides that, it generally stays the same, so the modifications are not major; it is mostly light editing of the existing architecture.

-

 

-

Brian:

-

So currently I hear about transformers everywhere: transformers in ChatGPT, transformers for computer vision, etc. These transformers use attention, a mechanism that allows them to focus on different parts of their input, for example different words in a sentence or different parts of an image. Does that make the model interpretable, since now we know exactly what parts of the input it focuses on, or attends to?

-

 

-

Jinen:

-

Well, this is actually an awesome question. Let me go back to the logistic regression versus deep neural network analogy I was talking about. We said that a logistic regression by itself is quite interpretable, because you have a series of weights and each feature has a weight associated with it.

-

So if you want to know why the model is making certain decisions, you go to the largest weight and work your way down, and you will be able to identify which features are making what degree of impact. It's interpretable to that extent. A deep neural network, though, is foundationally a series of logistic regressors that interact with one another to form a more complicated hybrid, and that interaction is what makes it uninterpretable. Transformers with attention work in a similar way: you have a lot of attention blocks stacked sequentially, and sometimes in parallel. What this means is that attention layers sometimes carry information about the input that is not immediately related to feature importance for a specific word. For instance, there have been cases where commas and separator tokens in general had very high attention weights, not because those generally irrelevant tokens were important for the specific example, but because a layer closer to the input was using them to pass something along to a later layer. So I think the attention mechanism at a single level is very interpretable, and I think it builds off intuition: we build attention by saying, given this series of words, here is how a human reader would read them, so let me encode that mathematically. That's also the approach CNNs used, where we broke the image up into neighborhoods of important pixels and evaluated neighborhood by neighborhood, and we found that was a better approach to learning than just feeding the image through an MLP, partly because of the smaller number of parameters as well. So intuitive biases are super important, and they also boost interpretability, which is awesome.
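As a sketch of how you might actually inspect attention weights, and see the separator-token effect Jinen mentions, here is a short example assuming the Hugging Face transformers library; the model choice and sentence are illustrative, not from the interview:

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "bert-base-uncased"  # hypothetical choice; any encoder model works
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_attentions=True)

inputs = tokenizer("Cows graze in green grasslands.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq, seq).
last = outputs.attentions[-1][0].mean(dim=0)  # average over heads: (seq, seq)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, row in zip(tokens, last):
    print(f"{tok:>12} attends most to {tokens[row.argmax().item()]}")
# Caveat from the interview: high weight on [SEP] or punctuation may be a
# layer passing information forward, not evidence those tokens "matter".
```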

-

 

-

Brian:

-

So I wanted to move toward questions for beginners in machine learning, people who are interested in it, because I know there are a lot of beginners in the Discord. My question is: how did you get into AI research at Purdue? Were you interested in it in high school, or did you really dig into it here?

-

 

-

Jinen:

-

I was very interested in research in general in high school, and my interests at the time were at the intersection of cybersecurity and machine learning. I was really interested in binary exploitation, and to a certain extent cryptography, and machine learning had always been an interest; that included the CV and NLP domains. But I really wanted to do something at the intersection of cybersecurity and machine learning, simply because I was interested in that. So I reached out to professors in the summer before I came to Purdue.

-

I met with Professor Antonio Bianchi, who redirected me to Professor Christina Garman, and I worked in her applied cryptography lab at Purdue for some time after that, doing research at the intersection of the two, which was fun.

-

 

-

Brian:

-

Did you take any classes related to AI or cryptography?

-

 

-

Jinen:

-

Oh, I read a book called Python Machine Learning by Sebastian Raschka. That was my first introduction to machine learning, and it was a fantastic book. Honestly, at this point I might not recommend anyone use TensorFlow, but it's good. In any case, before I deviate too much: reading books is a super awesome way to get familiar with concepts, because of the sheer amount of knowledge they're able to put into a small amount of space. I really like D2L.ai, Dive into Deep Learning. That's a fantastic resource; you can just type D2L.ai into your browser and it will answer basically every question you have about the most up-to-date things in machine learning, which is crazy.

-

 

-

Brian:

-

Okay, and my final question

-

There's like some background people right now

-

You know what, I'm just going to move

-

 

-

Jinen:

-

Okay, no problem

-

 

-

Brian:

-

Okay, my final question, and I'm planning to ask this of everyone I interview in the future.

-

So it's like a really general question and kind of a basic one. But what do you - think the future of AI will look like? So, you know, it doesn't have to relate to interpretability or - your current research interests, just like in general.

-

 

-

Jinen:

-

Funny enough, I think that question is a little more political than it is technical.

-

I mean, not politics in the governmental sense; I mean the private-lab-and-industry versus academia kind of politics. There is a lot of very useful discourse here that I genuinely do not know enough about to comment on properly, but the general idea is that a lot of papers are being published to arXiv as their final destination, instead of using arXiv as a preprint server. We can leave it at that.

-

The Invariant Risk Minimization paper I was talking about is one example: that paper is just on arXiv, and, as far as I know, it was never formally published. That's a problem, because research methodology needs to be verified for errors, and peer review helps the process a lot. If you can trust the research methodology, it gives a lot of credibility to the work, and we don't want credibility coming from the name of the organization; we want it coming from the actual quality of the research. Peer review in machine learning has been declining to a certain extent, in that a lot of papers published today are very noisy. That's not to diminish the fact that we still have so many amazing papers, and the majority of them do pass through peer review and are published at major venues. So it's important to recognize this dichotomy. I don't know what is going to happen with respect to it; if I had to take a wild guess, I would say not much changes.

-

Industry will continue to have a lot more compute, and they will continue to push things to arXiv and call it a day, if they publish at all. The GPT-4 paper, if you can call it a paper, they just put on their website and left it at that. As long as they don't call it research, that's not a problem for me; it's just a report. But they still don't provide that value back to the world, because everything they do is completely commercialized, and that kind of sucks. I would hope it all became free and open source for everyone to use, and that research got democratized, but I'm not very optimistic, unfortunately.

-

 

-

Brian:

-

I've asked all my questions, and you were a really great person to interview. So thank you, Jinen.

-

 

-

Jinen:

-

Thank you so much. So yeah, I had a good time

-

 

-

 

-
- - - - - - + + +

AI Interpretability with Jinen Setpal

+

September 6, 2023

+ +