new functionality + enhance performance #7

Open
man2machine opened this issue Oct 11, 2023 · 17 comments

Comments

@man2machine

man2machine commented Oct 11, 2023

Assalamu alaikum

This looks great! May Allah (swt) reward you. If you need help with anything related to AI/ML, I can do that in shaa Allah, since I am experienced in that area. I don't have much experience with web development, though.

I find that the extension does not work in some cases, as you were saying in the issue in fushatech/tahir#15.

Perhaps it would be better to use a model that simply detects males or females and is not only a face detection model. I would prefer for myself for example to simply block all images of a certain gender instead of only just faces or nudity.

Second, I think it would be good to have an option to do the reverse of the current setup, which would improve performance. Instead of running a somewhat slow procedure to detect and blur items, everything could start blurred by default and then be unblurred based on the detection.

Open to collaboration! (Although I may not get significant time until a couple of weeks later)

@alganzory
Owner

alganzory commented Oct 11, 2023

Wa'alaikum Assalam,

Thank you brother, may Allah bless you for your support and willingness to help.

> Perhaps it would be better to use a model that simply detects males or females and is not only a face detection model. I would prefer for myself for example to simply block all images of a certain gender instead of only just faces or nudity.

I would definitely prefer a model that does what you just described. I just couldn't find anything that does that out of the box, and I don't have much experience with AI models, let alone getting them to work with JS, so I'd highly appreciate your help with that.

> Second, I think it would be good to have an option to do the reverse of the current setup, which would improve performance. Instead of running a somewhat slow procedure to detect and blur items, everything could start blurred by default and then be unblurred based on the detection.

YES, this would be much better. The user wouldn't notice the slow startup, which is mainly caused by having to start up the models on every new tab (I don't know if there's a workaround for this in Manifest V3, but I am looking into it). So expect this feature as a "Reverse mode" toggle in the next version (in a few days inshallah).
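
A minimal sketch of how a "Reverse mode" toggle could drive the blur decision. The `Verdict` states, the `BlurSettings` object and the `shouldBlur` function are all hypothetical names for illustration; none of them come from the extension's code.

```typescript
// Hypothetical per-image state: "pending" means the model has not answered yet.
type Verdict = "pending" | "safe" | "unsafe";

// Hypothetical settings object holding the toggle discussed above.
interface BlurSettings {
  reverseMode: boolean; // true: blur first, unblur once an image is cleared
}

// Decide whether an image should currently be blurred.
function shouldBlur(verdict: Verdict, settings: BlurSettings): boolean {
  if (verdict === "unsafe") return true; // flagged images are always hidden
  if (verdict === "safe") return false; // cleared images are always shown
  // Still waiting on the model: reverse mode hides, normal mode shows.
  return settings.reverseMode;
}
```

With this shape, the only difference between the two modes is what the user sees while detection is still pending, which is exactly where the slow model startup would otherwise be noticeable.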

> Open to collaboration! (Although I may not get significant time until a couple of weeks later)

Thanks man, I'd really appreciate your help, especially with finding a model that can do the job. May Allah reward you.

@man2machine
Author

man2machine commented Oct 13, 2023

Ok sounds good. I'll start looking into this in shaa Allah. I would like to know how you run the models on your end. Do you use TensorFlow.js or something else?

Second, I need to know: do you resize images? How do you deal with images of different sizes? Does the model you currently use accept images of any size, or only a specific size? And if you do resize, how (crop, squash, etc.)?

Finally for videos, do you basically capture the video as an image and run it through the model, or do you have a video model?

I need this information so that I can find or make a model that can work for the use case that we are discussing in shaa Allah. Depending on what you say, I may be able to find a model online that works or may need to train one for this specific purpose.

Please let me know! Jazak Allahu Khair.

@man2machine
Author

I looked at your code and it seems you are using a library from https://github.com/vladmandic/human/wiki/Models
Which models are you using currently?

@alganzory
Owner

@man2machine, thank you for your commitment to supporting the project. Yes, as you pointed out, I am currently relying on the Human library, which does most of the heavy lifting: it pre-processes the input and handles both image and video inputs, all behind a very neat API. It offers a number of models, which you referenced, but the ones I am currently using are:

  • Face Detection: MediaPipe BlazeFace, back or front variation (back is better when the face isn't big and right in the center)
  • Face Description: HSE FaceRes. It's decent and lightweight, but the accuracy of its gender detection isn't impressive (at least in my findings)

But obviously there are some modifications to the original models to make them work with the library.

The Human library does provide body detection as well via other BlazePose models, but I am not using it at the moment because it doesn't provide any value (there's no body gender classification).

So in an ideal scenario, we would extend the Human library and/or its models to provide one model that does it all. That would eliminate the need for raw TensorFlow.js code to do all the pre-processing and processing, although that is also a very acceptable solution if we do find the right model.

I came across NudeNet and I think it's almost exactly what we are looking for. There's even a fork of it by the developer of Human, but I haven't tried either of them yet.

@man2machine
Author

Apologies for the lack of updates on this. As I mentioned earlier, I don't currently have time to work on this anytime soon, but as soon as the holidays come, when I have a break from studies, I should be able to contribute in shaa Allah. :)

@alganzory
Owner

No worries, thanks for the support though.

@alganzory
Owner

> I came across NudeNet and I think it's almost exactly what we are looking for. There's even a fork of it by the developer of Human, but I haven't tried either of them yet.

An update on this: I did try NudeNet and the fork, and they are too slow to be used in a browser extension. So far the current setup of NSFW + face detection is alright, and the accuracy is acceptable except for some corner cases (people with head covers, such as sheikhs (lol), and children's ages not always being estimated correctly). So ideally the search for better face detection and recognition models that are also not too heavy for this environment should keep going inshallah.

@man2machine
Author

Assalamu alaikum @alganzory

It has been a long time since I got back to this. Apologies for the delays; I had some personal difficulties during the past few months, so I couldn't work on this. However, my exams for this quarter end this week in shaa Allah, so I should have more free time during this Ramadan.

Can you give me a quick rundown of what exact models you are using right now, and what API you are using to call them, so that I can see how to improve on this from an ML perspective? I looked at the code, and I am seeing different things like the human API, tfjs, etc. but it is unclear exactly what is being used right now and what you have tried.

Jazak Allahu khair.

@alganzory
Owner

> Assalamu alaikum @alganzory
>
> It has been a long time since I got back to this. Apologies for the delays; I had some personal difficulties during the past few months, so I couldn't work on this. However, my exams for this quarter end this week in shaa Allah, so I should have more free time during this Ramadan.
>
> Can you give me a quick rundown of what exact models you are using right now, and what API you are using to call them, so that I can see how to improve on this from an ML perspective? I looked at the code, and I am seeing different things like the human API, tfjs, etc. but it is unclear exactly what is being used right now and what you have tried.
>
> Jazak Allahu khair.

Walaikum Asalam,

So basically the project uses TensorFlow.js through a wrapper library called human.js. I used that library because it also provides some models and very nice APIs. The project uses two of the models it provides: one for face detection called blazeface, and one for face description (classification?) called faceres. These two models handle the gender classification part. For the NSFW detection, we use NSFW.js; it's a TensorFlow.js model, so it works with this setup.

Here's what's good and bad about these models:

  • Blazeface + faceres:
    Pros: relatively fast and lightweight, quite accurate, and gives face box boundaries that I can use to draw the detected face region
    Cons: gender classification by face only, so if the face angle isn't good or if there's "haram" content without a face, nothing is detected

  • NSFW model:
    Pros: very lightweight and fast, gives classes like "Porn", "Sexy", "Neutral", "Hentai"
    Cons: the accuracy isn't all that great, and it has a very weird issue with hands: almost any picture of hands will be flagged as NSFW, and sometimes animals too. I'd say this model is the main cause of false positives. It's also just a classification model, not a detection model, so it doesn't give location info that I can use to draw
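
To make the trade-off above concrete, here is a rough sketch of how the two kinds of output could be combined into one blur decision. The `className`/`probability` pair mirrors what NSFW.js returns from `classify`, and `box`/`gender`/`genderScore` are modeled on Human's face results, but the interfaces, the `decideBlur` function and the thresholds are assumptions for illustration, not the extension's actual logic.

```typescript
// Assumed shapes, written out locally so the sketch is self-contained.
interface NsfwPrediction { className: string; probability: number; }
interface FaceResult {
  box: [number, number, number, number]; // x, y, width, height in pixels
  gender: string;
  genderScore: number; // confidence between 0 and 1
}
// Hypothetical user options.
interface BlurOptions {
  flaggedClasses: string[]; // e.g. ["Porn", "Sexy", "Hentai"]
  nsfwThreshold: number;
  blurGender: string;
  genderThreshold: number;
}
type BlurDecision =
  | { kind: "none" }
  | { kind: "whole-image" }
  | { kind: "regions"; boxes: Array<[number, number, number, number]> };

function decideBlur(nsfw: NsfwPrediction[], faces: FaceResult[], options: BlurOptions): BlurDecision {
  // The classifier gives no location, so a positive result can only blur everything.
  const flaggedScore = nsfw
    .filter((p) => options.flaggedClasses.indexOf(p.className) !== -1)
    .reduce((sum, p) => sum + p.probability, 0);
  if (flaggedScore >= options.nsfwThreshold) return { kind: "whole-image" };

  // The face models give boxes, so matching faces can be blurred individually.
  const boxes = faces
    .filter((f) => f.gender === options.blurGender && f.genderScore >= options.genderThreshold)
    .map((f) => f.box);
  return boxes.length > 0 ? { kind: "regions", boxes } : { kind: "none" };
}
```

Raising `nsfwThreshold` is one possible knob for the hand and animal false positives, at the cost of missing more true positives.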

Ideally we want our own model that does both detection and classification. I found https://github.com/notAI-tech/NudeNet and tried their browser demo. I think I also tried using the model in the extension before, but it was very slow. If we can have something like this (because it does nudity and face detection and classification), then we would have it figured out.

I'd really appreciate your support, jazak allahu khairan

@man2machine
Author

Wa iyyakum.

I'm researching how to potentially create our own model that would meet our specific needs, or combine other models to meet our requirements.

As for the three models you described (blazeface, faceres and nsfwjs), I am assuming they all take image inputs of a fixed size. However, the images on a website come in various sizes, so how do you resize/crop images before you feed them into these models? This is important for training a new model, since we want the training procedure to follow the same resizing/cropping method that you use in the extension.

@alganzory
Owner

alganzory commented Mar 21, 2024 via email

@man2machine
Author

Thanks for letting me know. I meant to ask how exactly you resize the image. For example do you simply stretch/squish the image (so a wide image would look squished), or do you shrink the image and maintain the aspect ratio by keeping black borders around the image? Or do you crop the image and maintain the aspect ratio?
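
For reference, the three options in the question can be sketched as follows. This is a minimal illustration that assumes a square model input of `size` pixels. `planResize`, `ResizeMode` and `Placement` are hypothetical names, and nothing here claims to be what the extension or the Human library actually does.

```typescript
type ResizeMode = "stretch" | "letterbox" | "crop";

// Source rectangle (read from the original image) and destination rectangle
// (written into the size x size model input), all in pixels.
interface Placement {
  sx: number; sy: number; sw: number; sh: number;
  dx: number; dy: number; dw: number; dh: number;
}

function planResize(width: number, height: number, size: number, mode: ResizeMode): Placement {
  if (mode === "stretch") {
    // Whole image squashed to the square: aspect ratio is not preserved.
    return { sx: 0, sy: 0, sw: width, sh: height, dx: 0, dy: 0, dw: size, dh: size };
  }
  if (mode === "letterbox") {
    // Whole image scaled to fit, centered, with empty borders on the short axis.
    const scale = size / Math.max(width, height);
    const dw = Math.round(width * scale);
    const dh = Math.round(height * scale);
    return {
      sx: 0, sy: 0, sw: width, sh: height,
      dx: Math.floor((size - dw) / 2), dy: Math.floor((size - dh) / 2), dw, dh,
    };
  }
  // "crop": largest centered square, so the edges of the long axis are discarded.
  const side = Math.min(width, height);
  return {
    sx: Math.floor((width - side) / 2), sy: Math.floor((height - side) / 2), sw: side, sh: side,
    dx: 0, dy: 0, dw: size, dh: size,
  };
}
```

The fields are ordered to match the nine-argument form of the canvas `drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh)` call, so the same plan could be reused on the training side to keep preprocessing identical.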

@man2machine
Author

man2machine commented Mar 21, 2024

The first priority I have is to create a model that classifies images without any detection bounding box, but is accurate, and based on any part of the body, not just the face. After I get that, I would move on to creating a model that provides the bounding boxes as well. If nsfw is detected, it should just blur the entire image.

The most difficult part right now seems to be obtaining the necessary training data, since it appears that the training data for blazeface is not disclosed. The training data for faceres and the nsfw model is available, however.

Between blazeface and faceres, which of the models is the most performant? Or are they used for different purposes? I saw online that BlazeFace only returns the face location and not the gender, so do you crop out the face using BlazeFace, and then use FaceRes on the cropped face-only image to determine the gender?

Also, are you using the blazeface front or back model? In the assets, I see it just says blazeface instead of blazeface-back or blazeface-front.

@alganzory
Owner

> The first priority I have is to create a model that classifies images without any detection bounding box, but is accurate, and based on any part of the body, not just the face. After I get that, I would move on to creating a model that provides the bounding boxes as well. If nsfw is detected, it should just blur the entire image.
>
> The most difficult part right now seems to be obtaining the necessary training data, since it appears that the training data for blazeface is not disclosed. The training data for faceres and the nsfw model is available, however.
>
> Between blazeface and faceres, which of the models is the most performant? Or are they used for different purposes? I saw online that BlazeFace only returns the face location and not the gender, so do you crop out the face using BlazeFace, and then use FaceRes on the cropped face-only image to determine the gender?
>
> Also, are you using the blazeface front or back model? In the assets, I see it just says blazeface instead of blazeface-back or blazeface-front.

Blazeface is for the detection and faceres does the description part. The process of cutting out the face and then running faceres is handled entirely by the library (human.js).
I am using blazeface back, which is a little slower than blazeface front but a bit more accurate, especially for videos.
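
The detect, crop, then describe flow described above can be sketched generically. This is only an illustration of the two-stage idea with the models passed in as plain functions. `FaceBox`, `clampBox` and `detectThenDescribe` are hypothetical names, the real models would be asynchronous, and human.js performs these steps internally in its own way.

```typescript
// Hypothetical box shape in pixels.
interface FaceBox { x: number; y: number; width: number; height: number; }

// Clip a detector box to the image bounds so the crop never reads outside the image.
function clampBox(box: FaceBox, imageWidth: number, imageHeight: number): FaceBox {
  const x = Math.max(0, Math.min(box.x, imageWidth));
  const y = Math.max(0, Math.min(box.y, imageHeight));
  const right = Math.max(x, Math.min(box.x + box.width, imageWidth));
  const bottom = Math.max(y, Math.min(box.y + box.height, imageHeight));
  return { x, y, width: right - x, height: bottom - y };
}

// Stage 1 (detect) finds face boxes; stage 2 (describe) runs on each cropped face.
function detectThenDescribe<Img, Desc>(
  image: Img,
  imageWidth: number,
  imageHeight: number,
  detect: (image: Img) => FaceBox[],
  crop: (image: Img, box: FaceBox) => Img,
  describe: (face: Img) => Desc
): Array<{ box: FaceBox; description: Desc }> {
  return detect(image)
    .map((box) => clampBox(box, imageWidth, imageHeight))
    .filter((box) => box.width > 0 && box.height > 0) // drop boxes fully outside the image
    .map((box) => ({ box, description: describe(crop(image, box)) }));
}
```

Because the second stage only ever sees the cropped face, anything outside the detected boxes never reaches the description model, which is the limitation discussed in this thread.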

As for the training data, I'd say we need data that's not limited to faces, since our classification should, as you said, run on the body and not just the face.
Have you gotten the chance to look at NudeNet? https://github.com/notAI-tech/NudeNet/
I also came across this: https://github.com/nghorbani/homogenus (I feel like it's what we want, but I haven't taken a deeper look).

@man2machine
Author

The second repo you linked looks promising. However, the output of the model isn't exactly what we want, as it tries to generate a 3D model of a human from the 2D image. The model is thus probably a lot more expensive to run and gives us information we don't need. What is useful from that link is the training data they used.

I think the detection needs to be improved so that it isn't reliant on either the face or the whole body. It should work with parts of the body, different angles, real people vs. drawings, etc. This can be done as long as the training data is available, even if the model isn't available or doesn't output exactly what we want. If I can find appropriate training data from various links and put it together, I can write the code, in shaa Allah, to train a model the size of the BlazeFace model or MobileNetV3, both of which are appropriate for this extension.

So right now is basically the research stage where I just go on the internet and try to put together all the possible training data & models I can get. If you find any other links like this please let me know!

Jazak Allahu Khair

@alganzory
Owner

@man2machine

That's good to hear. I feel like it would make sense to start a new repo just for work on this. I created this one to save these discussions and findings, and maybe code as well when we get to that stage.

@man2machine
Author

Assalamu alaikum

Just wanted to share how your extension is making rounds amongst Muslims online: https://muslimskeptic.com/2024/02/15/guarding-gaze/

Allahu akbar!

Jazak Allahu khayr
