
Calculating number of linear regions #10

Closed
YiteWang opened this issue May 28, 2021 · 6 comments

Comments

@YiteWang

YiteWang commented May 28, 2021

Dear authors,

I have a question about calculating the number of linear regions. It seems that in TE-NAS, input images are reduced to size (1000, 1, 3, 3):
lrc_model = Linear_Region_Collector(input_size=(1000, 1, 3, 3), sample_batch=3, dataset=xargs.dataset, data_path=xargs.data_path, seed=xargs.rand_seed)

Could you explain the reason behind this?

@taoyang1122

Got a similar question here. Is this to reduce memory cost?

@chenwydj
Collaborator

chenwydj commented Jun 3, 2021

Hi @YiteWang and @taoyang1122,

Thank you for your interest in our work!

Deep NNs typically have a very large number of linear regions. The larger the input dimension is (e.g. larger than 1x3x3), the more likely the input samples are separated into different linear regions. That means, if we use a larger input size and forward 3000 samples, we may end up with #Linear_Regions = 3000 (i.e. all input samples reside in different linear regions) for all NNs in the search space. To put it another way, reducing the input dimension makes the expressivities of different NNs more distinguishable.
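As a rough illustration of why the input dimension matters (a back-of-the-envelope sketch, not the TE-NAS code; it uses only the classical hyperplane-arrangement bound for a single ReLU layer):

    # For one ReLU layer with N neurons, the N hyperplanes split R^d into at
    # most sum_{i=0}^{d} C(N, i) regions, which grows steeply with d.
    from math import comb

    def max_regions(n_neurons, input_dim):
        return sum(comb(n_neurons, i) for i in range(min(input_dim, n_neurons) + 1))

    print(max_regions(64, 1 * 3 * 3))    # ~3e10 regions for a 1x3x3 input
    print(max_regions(64, 3 * 32 * 32))  # 2^64 regions for a CIFAR-sized input

Deep networks have more involved geometry than this one-layer bound, but the trend is the same: the smaller the input dimension, the coarser the partition, and the more often the samples collide in the same region, which is what makes the counts comparable across networks.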

Hope that helps!

@taoyang1122

@chenwydj Thanks very much! I am trying to reproduce the results on ImageNet, but I found darts_evaluation to be very slow: it is going to take more than 4 days to train 350 epochs on 8 GPUs. Is this the same on your side? Thanks!

@maryanpetruk

@chenwydj I am having a similar issue to @taoyang1122. Could you please let us know what hardware you used to train the found architecture from the DARTS space on ImageNet with "batch_size = 768" on "8-gpu"?

@chenwydj
Collaborator

Hi @taoyang1122 and @maryanpetruk,

We used V100 GPUs to train on ImageNet. It is true that training from scratch on ImageNet is slow: 4 to 5 days is very common.

@Arioll

Arioll commented Jun 14, 2021

Hello, I'm trying to calculate the number of linear regions, but the linear region collector always returns the number of dimensions (the same number for all networks).

    @torch.no_grad()
    def update2D(self, activations):
        n_batch = activations.size()[0]
        n_neuron = activations.size()[1]
        self.n_neuron = n_neuron
        if self.activations is None:
            self.activations = torch.zeros(self.n_samples, n_neuron).cuda()
        self.activations[self.ptr:self.ptr+n_batch] = torch.sign(activations)  # after ReLU
        self.ptr += n_batch

    @torch.no_grad()
    def calc_LR(self):
        # (1) each element in res: A * (1 - B)
        res = torch.matmul(self.activations.half(), (1 - self.activations).T.half())
        # (2) make symmetric; each element in res: A * (1 - B) + (1 - A) * B.
        # A non-zero element indicates a pair of two different linear regions.
        res += res.T
        # (3) a non-zero element now indicates that two linear regions are identical
        res = 1 - torch.sign(res)
        res = res.sum(1)  # for each sample's linear region: how many identical regions among all samples
        res = 1. / res.float()  # contribution of each redundant (repeated) linear region
        self.n_LR = res.sum().item()  # number of unique regions (by aggregating the contributions of all regions)
        del self.activations, res
        self.activations = None

Here are the functions from lib/procedures/linear_region_counter.py that compute the number of linear regions. However, as far as I understand, in a ReLU network the outputs after each layer are non-negative, so torch.sign will return 0 or 1. So, according to the first function, the self.activations matrix will contain only 0s and 1s as its elements. In this case the result of lines (1), (2), and (3) will always be an identity matrix. Could you explain the idea behind the algorithm? Thank you
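For reference, here is a toy walk-through of the pairwise logic above (an illustrative sketch, not part of the repository). With 0/1 patterns, step (1) counts the neurons where the row sample is active and the column sample is not, step (2) symmetrizes this into the Hamming distance between the two patterns, and step (3) flags identical patterns; the result reduces to the identity matrix only when every sample has a distinct activation pattern:

    import torch

    # Three samples, two neurons; samples 0 and 1 share the same sign pattern.
    A = torch.tensor([[1., 0.],
                      [1., 0.],
                      [0., 1.]])

    res = A @ (1 - A).T         # (1) neurons where row sample is on, column sample off
    res = res + res.T           # (2) Hamming distance between the two patterns
    same = 1 - torch.sign(res)  # (3) 1 where two patterns are identical
    mult = same.sum(1)          # multiplicity of each sample's pattern (incl. itself)
    n_LR = (1. / mult).sum().item()  # each distinct pattern contributes exactly 1

    print(same)  # [[1, 1, 0], [1, 1, 0], [0, 0, 1]], not the identity
    print(n_LR)  # 2.0 -> two distinct linear regions among the three samples

When all 3000 samples do land in distinct regions, the matrix is the identity and the count equals the number of samples, which matches the saturation behavior described earlier in this thread.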

@YiteWang YiteWang closed this as completed Oct 5, 2021